A shot is a contiguous action recorded by one uninterrupted camera operation, and the frames within a shot remain spatio-temporally coherent. Segmenting a video stream into meaningful shots is the first step in video analysis and content-based video understanding. In this paper, a novel scheme based on improved two-dimensional entropy is proposed for partitioning video into shots. First, shot transition candidates are detected using a two-pass algorithm: a coarse search pass followed by a fine search pass. Second, using the two-dimensional entropy characteristics of the image, correctly detected transition candidates are classified into different transition types, while falsely detected shot breaks are identified and removed. Finally, the boundaries of gradual transitions are precisely located by applying the two-dimensional entropy characteristics of the image to the gradual transition. The system is tested on a large number of video sequences, and promising results are obtained.
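To make the entropy measure concrete, the following is a minimal sketch of the standard two-dimensional image entropy, computed over pairs of (gray level, 3x3 neighborhood mean), together with a naive candidate detector that flags frames whose entropy jumps relative to the previous frame. It is not the improved entropy or the two-pass search of the paper, whose details the abstract does not give. The names `two_d_entropy` and `candidate_cuts`, the 3x3 window that includes the center pixel, and the threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def two_d_entropy(image):
    """Standard 2D entropy (in bits) of a grayscale image.

    `image` is a list of rows of integer gray levels. Each pixel contributes
    the pair (gray level, integer mean of its 3x3 neighborhood); the entropy
    is taken over the empirical distribution of these pairs.
    Assumption: the neighborhood includes the center pixel and is clipped
    at the image border.
    """
    h, w = len(image), len(image[0])
    counts = Counter()
    for y in range(h):
        for x in range(w):
            total, n = 0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += image[yy][xx]
                        n += 1
            counts[(image[y][x], total // n)] += 1
    n_pixels = h * w
    return -sum((c / n_pixels) * math.log2(c / n_pixels) for c in counts.values())


def candidate_cuts(frames, threshold):
    """Hypothetical coarse detector: indices i where the entropy of frame i
    differs from that of frame i-1 by more than `threshold` bits."""
    entropies = [two_d_entropy(f) for f in frames]
    return [i for i in range(1, len(entropies))
            if abs(entropies[i] - entropies[i - 1]) > threshold]


if __name__ == "__main__":
    flat = [[10] * 4 for _ in range(4)]              # uniform frame
    split = [[0, 0, 255, 255] for _ in range(4)]     # left half dark, right half bright
    print(two_d_entropy(flat), two_d_entropy(split))
    print(candidate_cuts([flat, flat, split, split], threshold=0.5))
```

A uniform frame has zero entropy, while the half-dark, half-bright frame produces four equally likely pairs and therefore 2 bits. An entropy difference alone is a crude cue, since two different shots can have equal entropy, which is presumably one reason a finer second pass and a classification stage are needed.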
A new calibration algorithm for multi-camera systems using 1D calibration objects is proposed. The algorithm integrates rank-4 factorization with the method of Zhang (2004). Both the intrinsic and the extrinsic parameters are recovered from images, captured by the cameras, of the 1D object rotating around a fixed point. The algorithm is based on factorization of the scaled measurement matrix, whose projective depths are estimated from an analytical equation rather than in a recursive form. For 1D objects with more than three points, the algorithm extends the scaled measurement matrix. The obtained parameters are finally refined through maximum likelihood inference. Simulations and experiments with real images verify that the proposed technique achieves a good trade-off between the intrinsic and extrinsic camera parameters.
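The rank-4 property that factorization methods of this kind rely on can be written out briefly. The notation below (projective depths \lambda_{ij}, camera matrices P_i, homogeneous points X_j, m cameras and n points) is the usual projective factorization notation and is an assumption here, since the abstract does not define its symbols.

```latex
% Projection of point j in camera i, up to the projective depth \lambda_{ij}:
\lambda_{ij}\,\mathbf{x}_{ij} = P_i\,\mathbf{X}_j,
\qquad i = 1,\dots,m,\quad j = 1,\dots,n .

% Stacking all views gives the scaled measurement matrix (3m x n):
W =
\begin{bmatrix}
\lambda_{11}\mathbf{x}_{11} & \cdots & \lambda_{1n}\mathbf{x}_{1n} \\
\vdots & \ddots & \vdots \\
\lambda_{m1}\mathbf{x}_{m1} & \cdots & \lambda_{mn}\mathbf{x}_{mn}
\end{bmatrix}
=
\begin{bmatrix} P_1 \\ \vdots \\ P_m \end{bmatrix}
\begin{bmatrix} \mathbf{X}_1 & \cdots & \mathbf{X}_n \end{bmatrix}.

% Each P_i is 3 x 4 and each X_j is 4 x 1, hence
\operatorname{rank}(W) \le 4 .
```

Once the depths are known, a rank-4 decomposition of W yields the cameras and points up to a projective transformation, which is why estimating the depths analytically rather than recursively matters.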
Automatic artist recognition is important for acoustic music indexing, browsing, and content-based acoustic music retrieval, yet extracting the most representative and salient attributes to characterize diverse artists remains a challenging task. In this paper, we develop a novel system that recognizes artists automatically. The proposed system efficiently identifies the artist's voice in a raw song by analyzing substantive features extracted from both pure music and singing mixed with accompanying music. Experiments on songs of different genres show that the proposed system is feasible.
A closed-form solution to the problem of segmenting multiple 3D motion models from straight-line optical flow was proposed. It introduced the multibody line optical flow constraint (MLOFC), a polynomial equation relating motion models and line parameters. The motion models can be obtained analytically as the derivative of the MLOFC at the corresponding line measurement, without knowing the motion model associated with that line. Experiments on real and synthetic sequences were also presented.
A new method for reconstructing 3D scene points from nonparallel stereo is proposed. Given a pair of conjugate images from a calibrated, arbitrarily configured stereo system, the method computes the coordinates of 3D scene points directly, bypassing the image rectification or iterative solution involved in existing methods. Experimental results on both simulated data and real images validate the method. A practical application to a surgical navigator shows that the method improves the efficiency and accuracy of 3D reconstruction from a nonparallel stereo system compared with the conventional method, which employs an algorithm designed for standard parallel-axis stereo geometry.
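As an illustration of direct, non-iterative reconstruction without rectification, the sketch below uses classic midpoint triangulation: each image point is back-projected to a ray in a common world frame, and the 3D point is taken midway between the closest points of the two rays. This is a well-known textbook technique swapped in for illustration, not the method of the paper, whose formulation the abstract does not give. The function name `midpoint_triangulate` is hypothetical, and the rays (camera center plus direction) are assumed to have already been derived from the calibration.

```python
def dot(a, b):
    """Dot product of two 3-vectors given as sequences of floats."""
    return sum(x * y for x, y in zip(a, b))


def midpoint_triangulate(c1, d1, c2, d2):
    """Midpoint triangulation of two rays c_k + t_k * d_k (k = 1, 2).

    c1, c2: camera centers in the world frame.
    d1, d2: ray directions in the world frame (need not be unit length).
    Returns the point halfway between the closest points on the two rays.
    """
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b          # zero when the rays are parallel
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is undefined")
    # Closed-form minimizer of |c1 + t1*d1 - c2 - t2*d2|^2.
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    p1 = [ci + t1 * di for ci, di in zip(c1, d1)]
    p2 = [ci + t2 * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2 for u, v in zip(p1, p2)]


if __name__ == "__main__":
    # Two cameras 1 unit apart, both looking at the same scene point.
    target = [0.5, 0.2, 2.0]
    c1, c2 = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]
    d1 = [t - c for t, c in zip(target, c1)]
    d2 = [t - c for t, c in zip(target, c2)]
    print(midpoint_triangulate(c1, d1, c2, d2))
```

With noise-free rays the result coincides with the scene point. With noisy measurements the two rays no longer intersect and the midpoint is only an approximation, which is where a method tailored to the calibrated geometry can be expected to do better.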