Scale-Space Theory

Scale-space theory with applications: Selected publications sorted by subject

General

Lindeberg (2021) "Normative theory of visual receptive fields", Heliyon, 7(1): e05897: 1-20. (Download PDF)
Lindeberg (2014) "Scale selection", Computer Vision: A Reference Guide, (K. Ikeuchi, ed.), pages 701-713. (PDF 2.3 Mb) (Updated 2nd edition 2021)
Lindeberg (2013) ``Generalized axiomatic scale-space theory'', Advances in Imaging and Electron Physics 178: 1-96. (PDF 20.1 Mb)
Lindeberg (2008) "Scale-space", Encyclopedia of Computer Science and Engineering, (B. Wah, ed), volume IV: 2495-2504, Jan 2009. dx.doi.org/10.1002/9780470050118.ecse609 (Sep 2008) (PDF 1.2 Mb)
Lindeberg (1993) Scale-Space Theory in Computer Vision, Springer. (Online edition at SpringerLink)
Lindeberg (1991) Discrete Scale-Space Theory and the Scale-Space Primal Sketch, PhD thesis, Department of Numerical Analysis and Computer Science, KTH Royal Institute of Technology.
A concise one-page illustration of some of the most basic ideas with a scale-space representation.

Review articles

Lindeberg (2012) ``Scale invariant feature transform'', Scholarpedia, 7(5): 10491, 2012. (Online version) (SIFT)
Lindeberg (1999) ``Principles for automatic scale selection'', Handbook on Computer Vision and Applications, (B. Jähne et al., eds.), volume 2: 239-274. (PDF 2.3 Mb)
Lindeberg (1999) "Automatic scale selection as a pre-processing stage for interpreting the visual world", Proc. Fundamental Structural Properties in Image and Pattern Analysis FSPIPA'99, (Budapest, Hungary), September 1999. Schriftenreihen der Österreichischen Computer Gesellschaft, volume 130: 9-23. (PDF 2.9 Mb)
Lindeberg (1996) ``Scale-space: A framework for handling image structures at multiple scales'', Proc. CERN School of Computing, (Egmond aan Zee, The Netherlands), 96(8): 27-38, September 1996. (PDF 2.0 Mb)
Lindeberg (1994) ``Scale-space theory: A basic tool for analysing structures at different scales'', Journal of Applied Statistics 21(2): 224-270. (PDF 933 kb)
Lindeberg and ter Haar Romeny (1994) ``Linear scale-space: (I) Basic theory and (II) Early visual operations.'' In: ter Haar Romeny (ed.) Geometry-Driven Diffusion, pages 1-77, Springer. (PDF 1.1Mb)

Basic theory of scale-space representation

Axiomatic theories for continuous and discrete scale-space as well as foveal scale-space. General theoretical framework for modelling the deep structure of how image features are related over scales and for how to measure the lifetime of image structures over scales, with general validity for both continuous and discrete signals.

Lindeberg (2025) "Approximation properties relative to continuous scale space for hybrid discretizations of Gaussian derivative operators", Frontiers in Signal Processing, 4: 144784: 1-12. (Download PDF)
Lindeberg (2024) "Discrete approximations of Gaussian smoothing and Gaussian derivatives", Journal of Mathematical Imaging and Vision 66(5): 759–800. (Download PDF)
Lindeberg (2016) "Time-causal and time-recursive spatio-temporal receptive fields", Journal of Mathematical Imaging and Vision 55(1): 50-88. (PDF 4.7 Mb) (Video recording of survey talk about this topic)
Lindeberg (2011) ``Generalized Gaussian scale-space axiomatics comprising linear scale-space, affine scale-space and spatio-temporal scale-space'', Journal of Mathematical Imaging and Vision 40(1): 36-81. (Shorter journal version 47 pages 17.3 Mb) (Longer technical report version 76 pages 17.4 Mb)
Lindeberg (1997) ``On the axiomatic foundations of linear scale-space: Combining semi-group structure with causality vs. scale invariance''. Technical report ISRN KTH NA/P--93/18--SE. Shortened version published as Chapter 6 in J. Sporring, M. Nielsen, L. Florack, and P. Johansen (eds.) Gaussian Scale-Space Theory: Proc. PhD School on Scale-Space Theory, (Copenhagen, Denmark, May 1996), pages 75-98, Kluwer Academic Publishers/Springer, 1997. (PDF 394 kb)
Lindeberg and Florack (1994) ``Foveal scale-space and the linear increase of receptive field size as a function of eccentricity'', Technical report ISRN KTH NA/P--94/27--SE, (PDF 378 kb)
Lindeberg (1993) ``Discrete derivative approximations with scale-space properties: A basis for low-level feature detection'', Journal of Mathematical Imaging and Vision 3(4): 349-376, 1993. (PDF 203 kb)
Lindeberg (1993) ``Effective scale: A natural unit for measuring scale-space lifetime'', IEEE Transactions on Pattern Analysis and Machine Intelligence 15(10): 1068-1074, 1993. (PDF 272kb) (figures in PDF 182kb)
Lindeberg (1992) ``Scale-space for N-dimensional discrete signals'', Proc. Workshop on Shape in Picture, (Driebergen, Netherlands), September 1992. In: Y. O. Ying (ed.) Shape in Picture, NATO ASI Series F: 571-590, Springer-Verlag, 1994. (PDF 256kb)
Lindeberg (1992) ``Scale-space behaviour and invariance properties of differential singularities'', Proc. Workshop on Shape in Picture, (Driebergen, Netherlands), September 1992. In: Y. O. Ying (ed.) Shape in Picture, NATO ASI Series F: 591-600, Springer-Verlag, 1994. (PDF 0.2Mb)
Lindeberg (1992) ``Scale-space behaviour of local extrema and blobs'', Journal of Mathematical Imaging and Vision 1(1): 65-99. Only a condensed, substantially shortened, version available here (PDF 407kb)
Lindeberg (1990) ``Scale-space for discrete signals'', IEEE Transactions on Pattern Analysis and Machine Intelligence 12(3): 234-254. (PDF 421 kb)

Computational modelling of visual receptive fields

Cell recordings of neurons in the primary visual cortex (V1) have shown that mammalian vision has developed receptive fields tuned to different sizes and orientations in the image domain as well as to different image velocities in space-time. We show how such families of idealized receptive field profiles can be derived mathematically from a small set of basic assumptions that correspond to structural properties of the environment. We also show how basic invariance properties of a visual system can be obtained already at the level of receptive fields, and that we can explain the different shapes of receptive field profiles found in biological vision from a requirement that the visual system should be able to be covariant or invariant to the natural types of image transformations that occur in the environment.

Lindeberg (2025) "Direction and speed selectivity properties for spatio-temporal receptive fields according to the generalized Gaussian derivative model for visual receptive fields", arXiv preprint arXiv:2511.08101. (Download PDF)
Lindeberg (2025) "Hybrid Lie semi-group and cascade structures for the generalized Gaussian derivative model for visual receptive fields", arXiv preprint arXiv:2509.15748. (Download PDF)
Lindeberg (2025) "On sources to variabilities of simple cells in the primary visual cortex: A principled theory for the interaction between geometric image transformations and receptive field responses", arXiv preprint arXiv:2509.02139. (Download PDF)
Lindeberg (2025) "Orientation selectivity properties for integrated affine quasi quadrature models of complex cells", PLOS One 20(9): e0332139:1-25. (Download PDF)
Lindeberg (2025) "Unified theory for joint covariance properties under geometric image transformations for spatio-temporal receptive fields according to the generalized Gaussian derivative model for visual receptive fields", Journal of Mathematical Imaging and Vision, 67(4):44: 1-49. (Download PDF)
Lindeberg (2025) "Relationships between the degrees of freedom in the affine Gaussian derivative model for visual receptive fields and 2-D affine image transformations, with application to covariance properties of simple cells in the primary visual cortex", Biological Cybernetics, 119(2-3):15: 1-25. (Download PDF)
Lindeberg (2025) "Do the receptive fields in the primary visual cortex span a variability over the degree of elongation of the receptive fields?", Journal of Computational Neuroscience, 53(3): 397-417. (Download PDF)
Lindeberg (2025) "Orientation selectivity properties for the affine Gaussian derivative and the affine Gabor models for visual receptive fields", Journal of Computational Neuroscience, 53(1): 61-98. (Download PDF)
Lindeberg (2023) "Covariance properties under natural image transformations for the generalized Gaussian derivative model for visual receptive fields", Frontiers in Computational Neuroscience 17: 1189949: 1-23. (Download PDF)
Lindeberg (2021) "Normative theory of visual receptive fields", Heliyon 7(1): e05897: 1-20. (Download PDF)
Lindeberg (2013) ``Invariance of visual operations at the level of receptive fields'', PLOS ONE 8(7): e66990:1-33, preprint at arXiv:1210.0754. (PDF 11.9 Mb)
Lindeberg (2013) ``A computational theory of visual receptive fields'', Biological Cybernetics 107(6): 589-635. (PDF 6.8 Mb)

Computational modelling of auditory receptive fields

A scale-space theory is developed for auditory signals, showing how temporal and spectro-temporal receptive fields can be derived by necessity and with good qualitative similarity to biological receptive fields in the inferior colliculus (ICC) and primary auditory cortex (A1).

Lindeberg and Friberg (2015) ``Scale-space theory for auditory signals'', Proc. SSVM 2015: Scale-Space and Variational Methods in Computer Vision, (Lege-Cap Ferret, France), May 31 - June 4, 2015, Springer LNCS 9087: 3-15. (PDF 1.3 Mb)
Lindeberg and Friberg (2015) ``Idealized computational models for auditory receptive fields'', PLOS ONE 10(3): e0119032:1-58. (PDF 4.0 Mb)
Lindeberg (2025) "A time-causal and time-recursive analogue of the Gabor transform", IEEE Transactions on Information Theory, 71(2): 1450-1480. (Download PDF)

Feature detection, automatic scale selection and scale-invariant image features

Feature detection methods based on the combination of Gaussian derivative operators at multiple scales. Special focus is given to the problem of scale selection, in order to adapt the local scales of processing to the local image structure. Specifically, the notion of automatic scale selection based on local extrema over scales of gamma-normalized derivatives makes it possible to define scale-invariant image features. The use of such scale-invariant image features allows the vision system to automatically handle the unknown scale variations that may occur in real-world image data, due to objects of different physical size as well as objects with different distances to the camera.
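As a minimal sketch of the scale selection principle described above, the following NumPy example smooths a synthetic Gaussian blob at a range of scales, multiplies the Laplacian response by t^gamma (here with gamma = 1 and the scale parameter t taken as the variance of the Gaussian kernel) and picks the scale at which the magnitude of the response at the blob centre is largest. The helper names (gaussian_kernel, smooth, normalized_laplacian_at), the sampled and truncated Gaussian kernel, the five-point Laplacian and the integer scale grid are illustrative assumptions, not the discretizations used in the papers listed below.

```python
import numpy as np

def gaussian_kernel(t):
    """Sampled 1-D Gaussian with variance t, truncated at 4 standard deviations."""
    radius = int(np.ceil(4.0 * np.sqrt(t)))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * t))
    return k / k.sum()

def smooth(image, t):
    """Separable Gaussian smoothing with zero padding at the borders."""
    k = gaussian_kernel(t)
    out = np.apply_along_axis(np.convolve, 0, image, k, mode="same")
    return np.apply_along_axis(np.convolve, 1, out, k, mode="same")

def normalized_laplacian_at(image, t, row, col, gamma=1.0):
    """Gamma-normalized Laplacian t^gamma * (Lxx + Lyy) at one pixel."""
    L = smooth(image, t)
    lap = (L[row + 1, col] + L[row - 1, col]
           + L[row, col + 1] + L[row, col - 1] - 4.0 * L[row, col])
    return t ** gamma * lap

# Synthetic test image: a bright Gaussian blob with variance t0 (assumed value).
size, t0 = 65, 16.0
c = size // 2
y, x = np.mgrid[:size, :size]
blob = np.exp(-((x - c) ** 2 + (y - c) ** 2) / (2.0 * t0))

# Evaluate the normalized response at the blob centre over a grid of scales
# and select the scale with the strongest response.
scales = np.arange(4.0, 37.0, 1.0)
responses = np.array([normalized_laplacian_at(blob, t, c, c) for t in scales])
selected_scale = scales[np.argmax(np.abs(responses))]
print("blob variance:", t0, "selected scale:", selected_scale)
```

For a Gaussian blob in the continuous case, the magnitude of the normalized response at the centre is proportional to t / (t0 + t)^2, which is maximal at t = t0, so the selected scale reflects the size of the blob. Rescaling the blob moves the selected scale accordingly, which is the property that scale-invariant features rely on.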

This theory, which includes the definition of scale-invariant feature detectors from scale-space extrema of the scale-normalized Laplacian and the scale-normalized determinant of the Hessian, constitutes the theoretical basis for the scale-invariant properties of the SIFT and SURF descriptors. The difference-of-Gaussians operator in the SIFT descriptor can be seen as an approximation of the scale-normalized Laplacian, and the blob detector in the SURF descriptor can be seen as an approximation of the scale-normalized determinant of the Hessian, with the underlying second-order Gaussian derivative operators replaced by Haar wavelets. In addition, we have proposed further scale-invariant interest point detectors based on other Hessian feature strength measures and a scale-invariant corner detector based on the scale-normalized rescaled level curve curvature.
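The relation between a difference of Gaussians and the scale-normalized Laplacian can be sketched in a few lines. The derivation below assumes that the scale parameter is the variance t = sigma^2 of the Gaussian kernel, so that the scale-space representation L satisfies the diffusion equation with a factor 1/2, and that adjacent scale levels have standard deviations sigma and k sigma. The symbols DoG, k and Delta t are notation introduced here for illustration.

```latex
\begin{align}
  \partial_t L &= \tfrac{1}{2} \nabla^2 L, \\
  \mathrm{DoG}(x;\, t, \Delta t) &= L(x;\, t + \Delta t) - L(x;\, t)
     \approx \Delta t \, \partial_t L(x;\, t)
     = \frac{\Delta t}{2} \, \nabla^2 L(x;\, t), \\
  \intertext{and with $t = \sigma^2$ and $t + \Delta t = k^2 \sigma^2$:}
  \mathrm{DoG} &\approx \frac{k^2 - 1}{2} \, \sigma^2 \, \nabla^2 L
     = \frac{k^2 - 1}{2} \, \nabla^2_{\mathrm{norm}} L .
\end{align}
```

For a fixed ratio k between adjacent scale levels, the difference of Gaussians is thus approximately a constant multiple of the scale-normalized Laplacian (with gamma = 1), so extrema over space and scale of the two operators occur at approximately the same positions.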

Lindeberg (2015) ``Image matching using generalized scale-space interest points'', Journal of Mathematical Imaging and Vision 52(1): 3-36, 2015. (PDF 11.9 Mb)
Lindeberg (2013) ``Image matching using generalized scale-space interest points'', Proc. SSVM 2013: Scale Space and Variational Methods in Computer Vision, (Schloss Seggau, Graz region, Austria), Springer LNCS 7893: 355-367. (PDF 8.3 Mb)
Lindeberg (2013) ``Scale selection properties of generalized scale-space interest point detectors'', Journal of Mathematical Imaging and Vision 46(2): 177-210. (PDF 2.7 Mb)
Lindeberg and Bretzner (2003) ``Real-time scale selection in hybrid multi-scale representations'', Proc. Scale-Space'03: Scale-Space Methods in Computer Vision, (Isle of Skye, Scotland), Springer LNCS 2695: 148-163. (PDF 220 kb)
Laptev and Lindeberg (2003) ``A distance measure and a feature likelihood map concept for scale-invariant model matching'', International Journal of Computer Vision 52(2/3): 97-120. (PDF 1.5Mb)
Lindeberg (1998) ``Feature detection with automatic scale selection''. International Journal of Computer Vision 30(2): 77-116. (PDF 3.5Mb) (Comprises the basic theory for scale-invariant interest points and image descriptors.)
Lindeberg (1998) ``Edge detection and ridge detection with automatic scale selection'', Proc. CVPR'96: Computer Vision and Pattern Recognition, (San Francisco, California), June 1996, pages 465-470. (PDF 1.6Mb) International Journal of Computer Vision 30(2): 117-154, 1998. (PDF 10.3Mb)
Lindeberg (1994) ``Junction detection with automatic selection of detection scales and localization scales'', Proc. ICIP'94: International Conference on Image Processing, (Austin, Texas), Nov 1994, volume I: 924-928. (PDF 290 kb)
Lindeberg (1993) ``Detecting salient blob-like image structures and their scales with a scale-space primal sketch: A method for focus-of-attention'', International Journal of Computer Vision 11(3): 283-318. (PDF 787kb). (The underlying algorithm for linking blobs and local extrema over scales is described in chapter 9 in Scale-Space Theory in Computer Vision as well as in chapter 7 in Discrete Scale-Space Theory and the Scale-Space Primal Sketch.)
Lindeberg (1993) ``On scale selection for differential operators'', Technical report ISRN KTH NA/P--94/05--SE. Shortened version in Proc. SCIA'93: Scandinavian Conference on Image Analysis, (Tromso, Norway), May 1993, pages 857-866. (PDF 0.5Mb)
Brunnstrom, Lindeberg and Eklundh (1992) ``Active detection and classification of junctions by foveation with a head-eye system guided by the scale-space primal sketch''. In: Sandini (ed.), Proc. ECCV'92: European Conference on Computer Vision, (Santa Margherita Ligure, Italy), May 1992, Springer LNCS 588: 701-709. (PDF 0.5Mb)
Brunnstrom, Eklundh and Lindeberg (1990) ``On scale and resolution in active analysis of local image structure'', Image and Vision Computing 8(4): 289-296.

Object recognition

Approaches to object recognition based on histograms of receptive field responses computed based on the scale-space framework.

Linde and Lindeberg (2012) ``Composed complex-cue histograms: An investigation of the information content in receptive field based image descriptors for object recognition'', Computer Vision and Image Understanding 116(4): 538-560. (PDF 4.2 Mb)
Linde and Lindeberg (2004) ``Object recognition using composed receptive field histograms of higher dimensionality'', Proc ICPR 2004: International Conference on Pattern Recognition, (Cambridge, U.K), August 2004, volume 2: 1-6. (PDF 108 kb)

Deep networks

Deep networks that handle scaling transformations and other natural image transformations in a theoretically well-founded manner, preferably in terms of provable covariance and invariance properties.

Lindeberg, Babaiee and Kiasari (2025) "Modelling and analysis of the 8 filters from the 'master key filters hypothesis' for depthwise-separable deep networks in relation to idealized receptive fields based on scale-space theory", arXiv preprint arXiv:2509.12746. (Download PDF)
Pedersen, Conradt and Lindeberg (2025) "Covariant spatio-temporal receptive fields for spiking neural networks", Nature Communications, 16:8231: 1-14.
Perzanowski and Lindeberg (2025) "Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations", Journal of Mathematical Imaging and Vision, 67:29: 1-39.
Jansson and Lindeberg (2022) "Scale-invariant scale-channel networks: Deep networks that generalise to previously unseen scales", Journal of Mathematical Imaging and Vision, 64(5): 506-536.
Lindeberg (2022) "Scale-covariant and scale-invariant Gaussian derivative networks", Journal of Mathematical Imaging and Vision, 64(3): 223-242.
Lindeberg (2021) "Scale-invariant and scale-covariant Gaussian derivative networks", SSVM 2021: Scale Space and Variational Methods in Computer Vision, Springer LNCS 12679: 3-14, 2021, extended version in arXiv:2011.14759.
Finnveden, Jansson and Lindeberg (2021) "Understanding when spatial transformer networks do not support invariance, and what to do about it", Proc. International Conference on Pattern Recognition (ICPR 2020), pages 3427-3434, extended version in arXiv:2004.11678.
Jansson, Maydanskiy, Finnveden and Lindeberg (2020) "Inability of spatial transformations of CNN feature maps to support invariant recognition", arXiv preprint arXiv:2004.14716.
Jansson and Lindeberg (2021) "Exploring the ability of CNNs to generalise to previously unseen scales over wide scale ranges", Proc. International Conference on Pattern Recognition (ICPR 2020), pages 1181-1188, extended version in arXiv:2004.01536.
Lindeberg (2020) ``Provably scale-covariant hierarchical continuous networks based on scale-normalized differential expressions coupled in cascade'', Journal of Mathematical Imaging and Vision, 62(1): 120–148.
Lindeberg (2019) "Provably scale-covariant networks from oriented quasi quadrature measures in cascade", Proc. SSVM 2019: Scale Space and Variational Methods in Computer Vision, Springer LNCS 11603: 328-340, preprint at arXiv:1903.00289.

Multi-scale processing of temporal data including temporal and spatio-temporal scale-space as well as temporal scale selection

Temporal and spatio-temporal scale-space concepts as well as methods for temporal and spatio-temporal scale selection.

Lindeberg (2025) "Time-causal and time-recursive wavelets", arXiv preprint arXiv:2510.05834.
Lindeberg (2023) “A time-causal and time-recursive scale-covariant scale-space representation of temporal signals and past time”, Biological Cybernetics, 117(1-2): 21-59. (Download PDF)
Lindeberg (2018) "Dense scale selection over space, time and space-time", SIAM Journal on Imaging Sciences, 11(1): 407-441. (PDF 2.5 Mb) (Supplement 1.2 Mb)
Lindeberg (2018) "Spatio-temporal scale selection in video data", Journal of Mathematical Imaging and Vision, 60(4): 525-562. (PDF 8.2 Mb)
Lindeberg (2017) "Spatio-temporal scale selection in video data", Proc. SSVM 2017: Scale-Space and Variational Methods in Computer Vision, (Kolding, Denmark), Springer LNCS 10302: 3-15. (PDF 27.6 Mb)
Lindeberg (2017) "Temporal scale selection in time-causal scale space", Journal of Mathematical Imaging and Vision, 58(1): 57-101. (PDF 4.7 Mb)
Lindeberg (2015) "Separable time-causal and time-recursive spatio-temporal receptive fields", Proc. SSVM2015: Scale-Space and Variational Methods in Computer Vision, (Lege-Cap Ferret, France), May 31 - June 4, 2015. Springer LNCS 9087: 90-102. (PDF 6.1 Mb)
Bretzner and Lindeberg (1998) ``Feature tracking with automatic selection of spatial scales'', Computer Vision and Image Understanding 71(3): 385-392. (PDF 319 kb)
Lindeberg (1997) ``Linear spatio-temporal scale-space'', Proc. Scale-Space'97: International Conference on Scale-Space Theory in Computer Vision (Utrecht, Netherlands), July 2-4, 1997. Springer LNCS 1252: 113-127. (PDF 258 kb)
Lindeberg (1997) ``On automatic selection of temporal scales in time-causal scale-space'', Proc AFPAC'97: Algebraic Frames for the Perception-Action Cycle, (Kiel, Germany), Springer LNCS 1315: 94-113. (PDF 415 kb)
Bretzner and Lindeberg (1997) ``On the handling of spatial and temporal scales in feature tracking'', Proc. Scale-Space'97: International Conference on Scale-Space Theory in Computer Vision, (Utrecht, Netherlands), July 2-4, 1997. Springer LNCS 1252: 128-139. (PDF 195 kb)
Lindeberg and Fagerström (1996) ``Scale-space with causal time direction'', Proc. ECCV'96: European Conference on Computer Vision, (Cambridge, England), April 1996. Springer LNCS 1064: 229-240. (PDF 0.4Mb)

Video analysis

Methods for video analysis based on histograms of spatio-temporal receptive field responses, computed based on the scale-space framework and with fully time-causal and time-recursive image operations over the temporal domain.

Jansson and Lindeberg (2018) "Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields", Journal of Mathematical Imaging and Vision, 60(9): 1369-1398. (Download PDF)
Jansson and Lindeberg (2017) "Dynamic texture recognition using time-causal spatio-temporal scale-space filters", Proc. SSVM 2017: Scale-Space and Variational Methods in Computer Vision, (Kolding, Denmark), Springer LNCS 10302: 16-28. (Download PDF)

Spatio-temporal image features, image descriptors, velocity adaptation and Galilean diagonalization with application to recognition of motion patterns, human actions and spatio-temporal events

Direct methods for recognizing spatio-temporal events with associated activities based on the local spatio-temporal image structure, without explicit inclusion of tracking mechanisms or other temporal trajectories. To handle a priori unknown motions relative to the observer, a general notion of local velocity adaptation is introduced. For parameterizing the spatio-temporal second-moment matrix/structure tensor and other related spatio-temporal image descriptors, we propose the notion of Galilean diagonalization, which gives a much more natural parameterization of purely spatial components and combined spatio-temporal relations compared to previous approaches in terms of eigenvalues that correspond to a non-physical rotation of space-time. These works also include the first formulation of local scale-adapted histograms of spatio-temporal gradients and optic flow, which can be seen as generalizations of the SIFT descriptor from space to space-time.

Laptev, Caputo, Schuldt and Lindeberg (2007) ``Local velocity-adapted motion events for spatio-temporal recognition'', Computer Vision and Image Understanding 108(3): 207-229. (PDF 2.2 Mb)
Laptev and Lindeberg (2004) ``Local descriptors for spatio-temporal recognition'', Proc. ECCV'04 Workshop on Spatial Coherence for Visual Motion Analysis, (Prague, Czech Republic), 2004, Springer LNCS 3667: 91-103. (PDF 757 kb)
Laptev and Lindeberg (2004) ``Velocity adaptation of space-time interest points'', Proc. ICPR'04: International Conference on Pattern Recognition, (Cambridge, UK), August 2004, volume I: 52-56. (PDF 570 kb)
Lindeberg, Akbarzadeh and Laptev (2004) ``Galilean-corrected spatio-temporal interest operators'', Proc. ICPR'04: International Conference on Pattern Recognition, (Cambridge, UK), August 2004, volume I: 57-62. (PDF 549 kb) (Longer technical report with more extensive theory and more experiments: 586 kb)
Laptev and Lindeberg (2003) ``Space-time interest points'', Proc. ICCV'03: International Conference on Computer Vision, (Nice, France), October 2003, volume I: 432-439. (PDF 1.0 Mb)
Laptev and Lindeberg (2003) ``Interest point detection and scale selection in space-time'', Proc. Scale-Space'03: Scale-Space Methods in Computer Vision, (Isle of Skye, UK), 2003, Springer LNCS 2695: 372-387 (PDF 1.1 Mb)
Laptev and Lindeberg (2004) ``Velocity-adaptation of spatio-temporal receptive fields for direct recognition of activities: An experimental study'', Image and Vision Computing 22: 105-116. Earlier version in Proc. ECCV'02 Workshop on Statistical Methods in Video Processing, (Copenhagen, Denmark), 2002, pages 61-66. (PDF 1.1 Mb)

Estimation of affine image deformations and direct computation of cues to surface shape including the theories for multi-scale second moment matrices/structure tensors and affine shape adaptation

Theories and algorithms for shape from texture and shape from disparity gradients based on local affine deformations of 2-D brightness patterns. Specifically, this framework includes a theory for local affine normalization of local image descriptors by affine shape adaptation, which makes it possible to define affine invariant image features and to perform affine invariant image and feature matching. These papers also outline the theory for multi-scale second-moment matrices, also referred to as multi-scale structure tensors.
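To make the notion of a second-moment matrix (structure tensor) concrete, the following NumPy sketch computes a Gaussian-weighted average of the outer product of the image gradient with itself around the image centre, and derives a simple anisotropy measure from its eigenvalues. This is a simplified illustration under stated assumptions: the gradient is estimated with plain central differences instead of Gaussian derivatives at a local scale, only a single integration scale is used, and the function name second_moment_matrix, the test pattern and the anisotropy measure are hypothetical choices made here.

```python
import numpy as np

def second_moment_matrix(image, integration_sigma):
    """Gaussian-weighted average of grad * grad^T around the image centre."""
    # np.gradient returns derivatives along axis 0 (rows, y) and axis 1 (columns, x).
    gy, gx = np.gradient(image.astype(float))
    rows, cols = image.shape
    y, x = np.mgrid[:rows, :cols]
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    # Gaussian window function, normalized to unit sum.
    w = np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * integration_sigma ** 2))
    w /= w.sum()
    mxx = np.sum(w * gx * gx)
    mxy = np.sum(w * gx * gy)
    myy = np.sum(w * gy * gy)
    return np.array([[mxx, mxy], [mxy, myy]])

# Test pattern that varies twice as fast along x as along y, as a stand-in
# for a texture that has been compressed along one direction.
y, x = np.mgrid[:64, :64]
pattern = np.sin(0.5 * x) + np.sin(0.25 * y)

mu = second_moment_matrix(pattern, integration_sigma=8.0)
eigenvalues = np.linalg.eigvalsh(mu)  # returned in ascending order
anisotropy = (eigenvalues[1] - eigenvalues[0]) / (eigenvalues[1] + eigenvalues[0])
print(mu)
print("anisotropy:", anisotropy)
```

The matrix is close to diagonal for this pattern, with a clearly larger entry for the x-direction, and the ratio between the eigenvalues reflects the relative compression of the pattern. Affine shape adaptation uses this type of information to adapt the shapes of the smoothing kernels to the local image structure.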

Gårding and Lindeberg (1996) ``Direct computation of shape cues using scale-adapted spatial derivative operators'', International Journal of Computer Vision 17(2): 163-191. (PDF 0.8Mb)
Lindeberg and Gårding (1997) ``Shape-adapted smoothing in estimation of 3-D depth cues from affine distortions of local 2-D brightness structure'', Proc. ECCV'94: European Conference on Computer Vision, (Stockholm, Sweden), May 1994, Springer LNCS 800: 389-400. (PDF 0.2Mb) Extended version in Image and Vision Computing 15: 415-434, 1997. (PDF 0.5Mb)
Lindeberg (1998) ``A scale selection principle for estimating image deformations'', Image and Vision Computing 16(14): 961-977. (PDF 374kb)
Lindeberg (1995) ``Direct estimation of affine deformations of brightness patterns using visual front-end operators with automatic scale selection'', Proc. ICCV'95: International Conference on Computer Vision, (Cambridge, MA), June 1995, 134-141. (PDF 311kb)
Gårding and Lindeberg (1994) ``Direct estimation of local surface shape in a fixating binocular vision system'', Proc. ECCV'94: European Conference on Computer Vision, (Stockholm, Sweden), May 1994, Springer LNCS 800: 365-376. (PDF 0.2Mb)
Lindeberg and Gårding (1993) ``Shape from texture from a multi-scale perspective'', Proc. ICCV'93: International Conference on Computer Vision, (Berlin, Germany), May 1993, 683-691. (PDF 0.5Mb) Extended version available as (printed) technical report ISRN KTH/NA/P--93/03--SE from KTH.

Structure and motion estimation (including visual control based on the 3-D hand mouse)

Methods for computing 3-D structure and motion from rigid point and line configurations that are projected from 3-D to 2-D using an affine projection model. The papers and the patent applications also show how the motion of a controlled object A can be controlled using motion estimates that are computed by visually observing another controlling object B (visual servoing), which we used for developing methods for human-computer interaction by 2-D and 3-D hand gestures.

Bretzner and Lindeberg (1999) ``Structure and motion estimation using sparse point and line correspondences in multiple affine views'', Technical report ISRN KTH/NA/P--99/13--SE. (PDF 473kb)
Bretzner and Lindeberg (1998) ``Use your hand as a 3-D mouse or relative orientation from extended sequences of sparse point and line correspondences using the affine trifocal tensor'', Proc. ECCV'98: European Conference on Computer Vision, (Freiburg, Germany), June 1998, Springer LNCS 1406: 141-157. (PDF 272kb)
Lindeberg and Bretzner (1999): ``Method and arrangement for controlling means for three-dimensional transfer of information by motion detection'', International patent application PCT/SE1999/000402, 1999 (now released).
Lindeberg and Bretzner (1998) ``Förfarande och anordning for överföring av information genom rörelsedetektering, samt användning av anordningen'', Swedish patent 9800884-0, March 1998 (now released).

Hand tracking and gesture recognition

Methods for real-time tracking of hand motions and recognition of hand poses based on scale-invariant image features, including the use of hand gestures for controlling other equipment using no other interface equipment than the user's own hand gestures. A real-time prototype system was demonstrated to the general public as early as 2001, letting visitors try for themselves what it is like to control other objects at a distance using just hand gestures.

Bretzner, Laptev and Lindeberg (2002) ``Hand-gesture recognition using multi-scale colour features, hierarchical features and particle filtering'', Proc. FG'02: Face and Gesture, (Washington D.C., USA), May 2002, 63-74. (PDF 159kb)
Bretzner, Laptev, Lindeberg, Lenman and Sundblad (2001) ``A prototype system for computer vision based human computer interaction'', Technical report ISRN KTH NA/P--01/09--SE. Department of Numerical Analysis and Computer Science, KTH (Royal Institute of Technology), S-100 44 Stockholm, Sweden, April 23-25, 2001. Demo presented at the Swedish IT-fair Connect 2001, Älvsjömässan, Stockholm, Sweden, April 2001. (PDF 522kb)
Laptev and Lindeberg (2001) "Tracking of multi-state hand models using particle filtering and a hierarchy of multi-scale image features", Technical report ISRN KTH/NA/P-00/12-SE, September 2000. Shortened version in Proc. Scale-Space'01: Scale-Space and Morphology in Computer Vision, (Vancouver, Canada), July 2001, Springer LNCS 2106: 63-74. (PDF 270 kb) (PDF 726 kb)
Laptev and Lindeberg (2001) "A multi-scale feature likelihood map for direct evaluation of object hypotheses", Technical report ISRN KTH/NA/P-01/03-SE, March 2001. Shortened version in Proc. Scale-Space'01: Scale-Space and Morphology in Computer Vision, (Vancouver, Canada), July 2001, Springer LNCS 2106: 98-110. (PDF 375 kb)

Medical image analysis

Methods for detecting brain activations in functional PET images and for automatically segmenting the brain from other tissue in an MRI image of a human head. In the European project Neurogenerator, we also developed a database with functional PET and fMRI images and cytoarchitectonically classified anatomical regions in the brain, including tools for meta-analysis to relate the functionally activated regions from different tasks to corresponding cytoarchitectonically defined neuroanatomical regions in the brain.

Undeman and Lindeberg (2003) ``Fully automatic segmentation of MRI brain images using probabilistic anisotropic diffusion and multi-scale watersheds'', Proc. Scale-Space'03: Scale-Space Methods in Computer Vision, (Isle of Skye, Scotland), June 2002, Springer LNCS 2695: 641-656. (PDF 219 kb)
Roland, Svensson, Lindeberg, Risch, Baumann, Dehmel, Frederiksson, Halldorson, Forsberg, Young and Zilles (2001) ``A database generator for human brain imaging'', Trends in Neurosciences 24(10): 562-564. (PDF 427 kb)
Rosbacke, Lindeberg and Roland (2001) "Evaluation of using absolute vs. relative base level when analyzing brain activation images using the scale-space primal sketch", Technical report ISRN KTH NA/P--99/14--SE, 1999. Journal of Medical Image Analysis 5(2): 89-110. (PDF 665 kb)
Lindeberg, Lidberg and Roland (1999) "Analysis of brain activation patterns using a 3-D scale-space primal sketch", Human Brain Mapping 7(3): 166-194, 1999. Earlier version presented in Proc. HBM'97: International Conference on Functional Mapping of the Human Brain Mapping, (Copenhagen, Denmark), May 19-23, 1997. Neuroimage 5(4): 393, 1997. (PDF 1.0 Mb)

Applications

Applications of scale-space techniques to different types of more specific computer vision problems:

Almansa and Lindeberg (2000) ``Fingerprint enhancement by shape adaptation of scale-space operators with automatic scale-selection'', IEEE Transactions on Image Processing 9(12): 2027-2042. (PDF 1.5 Mb) Earlier version presented as Almansa and Lindeberg (1997) "Enhancement of fingerprint images using shape-adapted scale-space operators", published as Chapter 3 in J. Sporring, M. Nielsen, L. Florack, and P. Johansen (eds.) Gaussian Scale-Space Theory: Proc. PhD School on Scale-Space Theory, (Copenhagen, Denmark, May 1996), pages 21-30, Kluwer Academic Publishers/Springer, 1997.
Lindeberg and Li (1997) ``Segmentation and classification of edges using minimum description length approximation and complementary junction cues'', Proc. SCIA'95: Scandinavian Conference on Image Processing, (Uppsala, Sweden), June 1995, pages 767-776. Extended version in Computer Vision and Image Understanding 67(1): 88-98, 1997. (PDF 1.0 Mb)
Bretzner and Lindeberg (2000) ``Qualitative multi-scale feature hierarchies for object tracking'', Journal of Visual Communication and Image Representation 11: 115-129. (PDF 348 kb)
Shokoufandeh, Dickinson, Jonsson, Bretzner and Lindeberg (2002) ``On the representation and matching of qualitative shape at multiple scales'', Proc. ECCV'02: European Conference on Computer Vision, (Copenhagen, Denmark), May 2002. Springer LNCS 2352: 759-775. (PDF 600 kb)
Wiltschi, Pinz and Lindeberg (2000) ``An automatic assessment scheme for steel quality inspection'', Technical report ISRN KTH/NA/P--98/20--SE. (Extended TR: PDF 5.8Mb) Machine Vision and Applications 12(3): 113-128.
Wiltschi, Lindeberg and Pinz (1997) ``Classification of carbide distributions using scale selection and directional distributions'', Proc. ICIP'97: International Conference on Image Processing, (Santa Barbara, California), October 1997, II:122-125. (PDF 1.1 Mb) (Extended TR: PDF 0.4Mb)

External links

Encyclopedia entry on scale-space in Encyclopedia of Computer Science and Engineering
Wikipedia articles on scale-space
Wikipedia articles on feature detection
Wikipedia articles on computer vision
Wikipedia articles on gesture recognition
Scholarpedia article on scale invariant feature transform (SIFT)
Encyclopedia entry on scale-space theory in Encyclopedia of Mathematics (Local copy)
Encyclopedia entry on edge detection
Encyclopedia entry on corner detection
Powers of ten interactive Java tutorial at Molecular Expressions website.
The SIFT descriptor for object recognition (developed by David Lowe at University of British Columbia) includes a combined scale selection and blob detection stage to obtain scale invariant interest points for subsequent matching, where the difference-of-Gaussians (DoG) blob detector can be seen as an approximation of the scale-space extrema of our scale-normalized Laplacian operator.
The SURF descriptor for image matching and object recognition (Herbert Bay, Tinne Tuytelaars and Luc Van Gool at ETH in Zurich) builds upon similar ideas of automatic scale selection to obtain scale invariant interest points with associated image descriptors. Specifically, the blob detector in the SURF descriptor can be seen as an approximation of the scale-space extrema of our scale-normalized determinant of the Hessian operator, with the underlying second-order Gaussian derivatives replaced by Haar wavelets and with a constant L1-norm normalization of the responses used to obtain appropriate scale selection properties, in analogy with our discrete normalization of operator responses in a real-time hybrid pyramid representation.
Integration of automatic scale selection and affine shape adaptation into schemes for performing wide baseline stereo matching (Adam Baumberg at Canon Research in Guildford, U.K.; Frederik Schaffalitzky and Andrew Zisserman at Oxford University; Tinne Tuytelaars and Luc Van Gool at University of Leuven, Belgium) and for detecting scale invariant and affine invariant interest points (Krystian Mikolajczyk and Cordelia Schmid at INRIA Rhone-Alpes). The affine invariant and scale invariant properties of these approaches are based on our theories for scale-invariant image features and affine invariant fixed points. Our previously developed scale-adaptive image features in terms of scale-space extrema of the scale-normalized Laplacian, the determinant of the Hessian and the scale-normalized rescaled level curve curvature were indeed originally defined to be truly scale invariant. Specifically, our notion of gamma-normalized derivatives was derived axiomatically from the requirement that image features obtained from local extrema over scales should be preserved under scaling transformations and be transformed in a scale covariant way. Moreover, our affine invariant fixed-point property of the second-moment matrix implies that, provided that our proposed affine shape adaptation procedure converges, the resulting affine normalized image patch should be preserved under affine transformations and transformed in an affine covariant way. This in turn means that the corresponding features will be affine invariant up to an undetermined rotation, which can easily be compensated for by a complementary orientation alignment.
Using combinations of scale selection and model-based ridge detection for blood vessel segmentation (Alejandro Frangi et al. at Utrecht University) and detection of tubular structures (Karl Krissian et al. at INRIA Sophia Antipolis) in medical 3-D images, which constitute extensions of our scale-adaptive and scale-invariant ridge detection methodology from 2-D to 3-D.
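
The relation between the difference-of-Gaussians operator and the scale-normalized Laplacian mentioned above can be illustrated with a minimal sketch. It is not taken from any of the cited implementations. It assumes an idealized 2-D Gaussian blob of variance t0, scale levels parameterized by the variance t = sigma^2, and a scale ratio k = 2^(1/3) in sigma between adjacent levels; all function names are hypothetical. Because Gaussian smoothing of a Gaussian blob gives another Gaussian with added variances, the responses at the blob center can be written in closed form, so only NumPy is needed.

```python
import numpy as np

def smoothed_blob_center(t0, t):
    # Center value of a unit-mass 2-D Gaussian blob (variance t0) after
    # Gaussian smoothing with variance t: a Gaussian of variance t0 + t.
    return 1.0 / (2.0 * np.pi * (t0 + t))

def normalized_laplacian_center(t0, t):
    # Scale-normalized Laplacian t * (L_xx + L_yy) at the blob center,
    # obtained by differentiating the Gaussian of variance t0 + t twice.
    return -t / (np.pi * (t0 + t) ** 2)

def dog_center(t0, t, k):
    # Difference of Gaussians between scale levels k^2 * t and t
    # (k is the ratio between adjacent levels in terms of sigma).
    return smoothed_blob_center(t0, k * k * t) - smoothed_blob_center(t0, t)

def select_scale(response, scales):
    # Scale level at which the magnitude of the response is maximal.
    return scales[np.argmax(np.abs(response))]

t0 = 16.0                                  # assumed blob variance (sigma0 = 4)
scales = np.geomspace(1.0, 256.0, 2001)    # candidate scale levels t
k = 2.0 ** (1.0 / 3.0)                     # assumed scale ratio in sigma

t_lap = select_scale(normalized_laplacian_center(t0, scales), scales)
t_dog = select_scale(dog_center(t0, scales, k), scales)

# For k close to 1, the diffusion equation dL/dt = (1/2) * Laplacian(L)
# gives DoG ~ ((k^2 - 1) / 2) * t * Laplacian(L).
k_small = 1.01
ratio = dog_center(t0, t0, k_small) / (
    0.5 * (k_small ** 2 - 1.0) * normalized_laplacian_center(t0, t0)
)

print("selected scale, normalized Laplacian:", t_lap)
print("selected scale, DoG:", t_dog)
print("DoG / scaled normalized Laplacian at t = t0:", ratio)
```

In this idealized setting the scale-normalized Laplacian assumes its extremum over scales at t = t0, so the selected scale reflects the size of the blob. The DoG response assumes its extremum at t = t0/k, where the geometric mean of its two scale levels equals t0, and the two operators agree up to a constant factor as k approaches 1.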

Further reading

Further publications on these and related topics are available from:

Google Scholar
ResearchGate
KTH Publication Database DIVA
List of publications of the Computational Vision and Active Perception Laboratory at KTH (updated only until 2003/2004)

Tony Lindeberg,
Professor
tony@kth.se
+46 8 790 62 05
