News feed
In the News feed you will find page updates, schedule changes, and posts from teachers (when they also need to reach previously registered students).
Pravin Kumar Rana edited 14 March 2013
Activity Type | Title | Date and Place* | Time
Seminar | Kick-off meeting | March 18, 2013 | 13:00
Seminar | Project plan presentation | April 10, 2013 | 13:00
Deadline | Submission of project plan | April 13, 2013 | 23:59
Seminar | Project progress presentation | April 30, 2013 | 13:00
Deadline | Submission of project report | May 26, 2013 | 23:59
Seminar | Final project presentation | May 28, 2013 | 13:00
*NOTE: The seminar meetings will take place in the SIP conference room, no. A367, Osquldas väg 10, Floor 3. The exception is the meeting on April 30, which will take place in the Admin conference room, no. B356, Osquldas väg 10, Floor 3. Use the room numbers to avoid confusion.
Pravin Kumar Rana edited 29 April 2013
Activity Type | Title | Date and Place* | Time
Seminar | Kick-off meeting - Introduction (pdf) | March 18, 2013, A367, Floor 3, Osquldas väg 10 | 13:00
Seminar | Project plan presentation | April 10, 2013, A367, Floor 3, Osquldas väg 10 | 13:00
Deadline | Submission of project plan | April 13, 2013 | 23:59
Seminar | Project progress presentation | May 02, 2013 (changed from April 30), B356, Floor 3, Osquldas väg 10 | 13:00
Deadline | Submission of project report | May 26, 2013 | 23:59
Seminar | Final project presentation | May 28, 2013, A367, Floor 3, Osquldas väg 10 | 13:00
*NOTE: The seminar meetings will take place in the SIP conference room, no. A367, Osquldas väg 10, Floor 3. The exception is the project progress presentation, which will take place in the Admin conference room, no. B356, Osquldas väg 10, Floor 3.
Pravin Kumar Rana edited 28 March 2013
Templates:
- EN2600 Project Plan Template
- EN2600 Project Report Template
- Presentation Template

Project Resources:
1: Mobile Visual Search using Stereo Features
- Feature Extraction Software
- Stockholm Building Data Set
Pravin Kumar Rana edited 14 March 2013
Estimation of Volumetric Depth from Multiview Video

Capturing multiple images of the same object from different viewpoints permits the estimation of the underlying 3D geometry of the object. Widely used are so-called depth maps, which indicate the shortest distance from the camera plane to the object for each pixel in the image. With multiple depth maps from different viewpoints, we have multiple estimates of depth for the same 3D point of the object. In general, these multiple depth values for a unique 3D point of the object are not consistent [1] and result in an ambiguous description of the underlying 3D geometry.
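As a rough illustration of this inconsistency, assuming pinhole cameras with known intrinsics and relative pose, and depth maps stored as NumPy arrays, a per-pixel consistency check in the spirit of [1] could look like the sketch below; all function and variable names are hypothetical, and the tolerance is an arbitrary assumption rather than a value from the paper.

```python
import numpy as np

def check_depth_consistency(depth_a, depth_b, K_a, K_b, R_ab, t_ab, tau=0.05):
    """Reproject each pixel of view A into view B using its depth estimate
    and flag it as consistent if view B's depth agrees within a tolerance.

    depth_a, depth_b : (H, W) depth maps of views A and B
    K_a, K_b         : (3, 3) camera intrinsics
    R_ab, t_ab       : rotation and translation taking A coordinates to B
    tau              : relative depth tolerance (assumed threshold)
    """
    H, W = depth_a.shape
    consistent = np.zeros((H, W), dtype=bool)
    K_a_inv = np.linalg.inv(K_a)
    for v in range(H):
        for u in range(W):
            z_a = depth_a[v, u]
            if z_a <= 0:
                continue
            # Back-project pixel (u, v) of view A to a 3D point in A's frame.
            X_a = z_a * (K_a_inv @ np.array([u, v, 1.0]))
            # Transform into view B's frame and project onto B's image plane.
            X_b = R_ab @ X_a + t_ab
            if X_b[2] <= 0:
                continue
            p_b = K_b @ X_b
            u_b = int(round(p_b[0] / p_b[2]))
            v_b = int(round(p_b[1] / p_b[2]))
            if 0 <= u_b < W and 0 <= v_b < H:
                z_b = depth_b[v_b, u_b]
                # Both estimates describe the same 3D point; compare them.
                consistent[v, u] = z_b > 0 and abs(X_b[2] - z_b) <= tau * z_b
    return consistent
```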
The multiple depth measurements can be used to construct a volumetric description of the underlying geometry of the 3D object [2]. However, this description will not be unique due to the inconsistent depth measurements. Hence, it is a challenge to estimate the volumetric depth at each time instance from multiview video. With multiview video, we have not only the neighboring views at the current time instance, but also neighboring views at past and future time instances. Hence, it is desirable to develop estimation techniques for volumetric depth that use both neighboring views and multiple time instances.
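One very simplified way to picture such a volumetric description, not the method of [2], is a voxel grid that collects one vote per depth measurement; the grid layout, the pose convention, and all names below are assumptions for illustration only.

```python
import numpy as np

def accumulate_depth_votes(depth_maps, intrinsics, poses, grid_shape, grid_min, voxel_size):
    """Build a simple voxel vote volume from several (possibly inconsistent)
    depth maps. Each valid pixel back-projects to one 3D point, which adds a
    vote to the voxel it falls into. grid_min is the world coordinate of the
    grid origin; all conventions here are illustrative assumptions.
    """
    votes = np.zeros(grid_shape, dtype=np.float32)
    for depth, K, (R, t) in zip(depth_maps, intrinsics, poses):
        H, W = depth.shape
        K_inv = np.linalg.inv(K)
        us, vs = np.meshgrid(np.arange(W), np.arange(H))
        pix = np.stack([us, vs, np.ones_like(us)], axis=-1).reshape(-1, 3).T  # 3 x N
        z = depth.reshape(-1)
        valid = z > 0
        # Back-project to camera coordinates, then to world coordinates
        # (assuming the pose maps world coordinates to camera coordinates).
        X_cam = (K_inv @ pix[:, valid]) * z[valid]
        X_world = R.T @ (X_cam - t.reshape(3, 1))
        # Quantize world points into voxel indices and accumulate votes.
        idx = np.floor((X_world.T - grid_min) / voxel_size).astype(int)
        inside = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
        np.add.at(votes, tuple(idx[inside].T), 1.0)
    return votes
```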
In this project, a group of students will work to design, implement, and evaluate algorithms for estimating volumetric depth data. The design shall achieve accurate estimates of the volumetric depth. The quality will be assessed by extracting enhanced depth maps from the estimated volumetric depth and by comparing them to the provided ground truth depth maps. The implementation will be evaluated on the given data sets. The results will be used to assess whether the design goals have been achieved.
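A minimal sketch of the kind of comparison implied above, assuming the enhanced and the ground truth depth maps are arrays of the same size; the choice of RMSE plus a bad-pixel ratio, and the threshold, are assumptions rather than course requirements.

```python
import numpy as np

def depth_map_error(estimated, ground_truth, bad_pixel_thresh=1.0):
    """Compare an enhanced depth map against ground truth on valid pixels.
    Returns the RMSE and the fraction of 'bad' pixels (absolute error above
    bad_pixel_thresh). Both metrics and the threshold are illustrative."""
    valid = ground_truth > 0
    err = estimated[valid] - ground_truth[valid]
    rmse = float(np.sqrt(np.mean(err ** 2)))
    bad_ratio = float(np.mean(np.abs(err) > bad_pixel_thresh))
    return rmse, bad_ratio
```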
For more information on projects in this area, please forward your inquiries to Markus Flierl or Pravin K. Rana.
References:
[1] P.K. Rana and M. Flierl, Depth Consistency Testing for Improved View Interpolation, Proc. IEEE International Workshop on Multimedia Signal Processing, Saint-Malo, France, Oct. 2010.
[2] S. Parthasarathy, A. Chopra, E. Baudin, P.K. Rana, and M. Flierl, Denoising of Volumetric Depth Confidence for View Rendering, Proc. 3DTV-Conference, Zurich, Switzerland, Oct. 2012.
Mobile Visual Search using Stereo Features

Visual search provides users with interactive and semantic access to real-world objects. With the integration of digital cameras into mobile devices, image-based information retrieval for mobile visual search [1] is developing rapidly. The challenges of mobile image retrieval are rooted in the bandwidth constraint and the limited computational capability of mobile devices.
Usually, one query image is used for mobile visual search, and the well-known Scale-Invariant Feature Transform (SIFT) [2] is utilized to extract relevant image features. The server receives the extracted and encoded image features from the mobile device, decodes them, and matches them against the stored image features of the image database. For matching, the descriptor vector of each image feature is used. The efficiency of matching is usually improved by random sample consensus (RANSAC) [3].
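As a rough sketch of this pipeline, assuming OpenCV 4.4 or later (where SIFT is available as cv2.SIFT_create()) and ignoring the feature encoding and transmission step, matching one query image against one database image could look like this; the function name and parameters are hypothetical.

```python
import cv2
import numpy as np

def match_query_to_database_image(query_gray, db_gray, ratio=0.75):
    """Extract SIFT features from both images, match descriptor vectors with
    a ratio test, and verify the matches geometrically with RANSAC.
    Returns the number of RANSAC inliers as a simple matching score."""
    sift = cv2.SIFT_create()
    kp_q, des_q = sift.detectAndCompute(query_gray, None)
    kp_d, des_d = sift.detectAndCompute(db_gray, None)
    if des_q is None or des_d is None:
        return 0
    # Match descriptor vectors (L2 distance) and keep unambiguous matches.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des_q, des_d, k=2)
    good = []
    for pair in matches:
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    if len(good) < 4:
        return 0
    # Geometric verification: fit a homography with RANSAC, count inliers.
    src = np.float32([kp_q[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_d[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    _, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return int(mask.sum()) if mask is not None else 0
```

In practice, the ratio-test threshold and the RANSAC reprojection tolerance would need to be tuned on training data rather than taken from this sketch.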
To improve the success rate of visual search, i.e. the rate of retrieval, we allow queries that use a pair of stereo images to be matched to a set of multiview images of the same object at the server. For that, we extract so-called stereo features from the stereo pair. Now, the challenge is to use the stereo features efficiently to maximize the rate of retrieval.
In this project, a group of students will work to design, implement, and evaluate algorithms for mobile visual search using stereo features. Using our dataset “Stockholm Buildings”, the rate of building retrieval shall be maximized by using stereo feature queries. The implementation will be evaluated on a given test dataset. The results will be compared to the rate of retrieval obtained when using monocular query images only.
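One naive way to combine the two views of a stereo query and to measure the retrieval rate on a labeled test set is sketched below; it reuses the hypothetical match_query_to_database_image() from the sketch above, and the scoring rule (summing inlier counts over both views) and the data layout are assumptions, not the project's prescribed method.

```python
def retrieval_rate(stereo_queries, database, true_labels):
    """stereo_queries : list of (left_img, right_img) grayscale query pairs
    database          : list of (label, db_img) entries, one per database image
    true_labels       : list of the correct building label for each query
    Returns the fraction of queries whose top-ranked label is correct."""
    correct = 0
    for (left, right), truth in zip(stereo_queries, true_labels):
        best_label, best_score = None, -1
        for label, db_img in database:
            # Combine evidence from both views of the stereo query.
            score = (match_query_to_database_image(left, db_img)
                     + match_query_to_database_image(right, db_img))
            if score > best_score:
                best_label, best_score = label, score
        correct += int(best_label == truth)
    return correct / len(stereo_queries)
```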
For more information on projects in this area, please forward your inquiries to Markus Flierl or Haopeng Li.
References:
[1] B. Girod, V. Chandrasekhar, D.M. Chen, N.M. Cheung, R. Grzeszczuk, Y. Reznik, G. Takacs, S.S. Tsai, and R. Vedantham, Mobile Visual Search, IEEE Signal Processing Magazine, vol. 28, no. 4, pp. 61-76, July 2011.
[2] D. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[3] M. Fischler and R. Bolles, Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography, Commun. ACM, vol. 24, no. 6, pp. 381-395, June 1981.