IDEAS home Printed from https://ideas.repec.org/p/crs/wpaper/2017-86.html
   My bibliography  Save this paper

Computational Optimal Transport

Author

Listed:
  • Gabriel Peyré

    (CNRS and DMA, ENS)

  • Marco Cuturi

    (CREST, ENSAE)

Abstract

Optimal Transport (OT) is a mathematical gem at the interface between probability, analysis and optimization. The goal of that theory is to define geometric tools that are useful to compare probability distributions. Let us briefly sketch some key ideas using a vocabulary that was first introduced by Monge two centuries ago: a probability distribution can be thought of as a pile of sand. Peaks indicate where likely observations are to appear. Given a pair of probability distributions—two different piles of sand—there are, in general, multiple ways to morph, transport or reshape the first pile so that it matches the second. To every such transport we associate an a “global” cost, using the “local” consideration of how much it costs to move a single grain of sand from one location to another. The goal of optimal transport is to find the least costly transport, and use it to derive an entire geometric toolbox for probability distributions. Despite this relatively abstract description, optimal transport theory answers many basic questions related to the way our economy works: In the “mines and factories” problem, the sand is distributed across an entire country, each grain of sand represents a unit of a useful raw resource; the target pile indicates where those resources are needed, typically in factories, where they are meant to be processed. In that scenario, one seeks the least costly way to move all these resources, knowing the entire logistic cost matrix needed to ship resources from any storage point to any factory. Transporting optimally two abstract distributions is also extremely relevant for mathematicians, in the sense that it defines a rich geometric structure on the space of probability distributions. That structure is canonical in the sense that it borrows, in arguably the most natural way, key geometric properties of the underlying “ground” space on which these distributions are defined. For instance, when the underlying space is Euclidean, key concepts such as interpolation, barycenters, convexity or gradients of functions extend very naturally to distributions when endowed with an optimal transport geometry. OT has a rich and varied history. Earlier contributions originated from Monge’s work in the 18th century, to be later rediscovered under a different formalism by Tolstoi in the 1920’s, Kantorovich, Hitchcock and Koopmans in the 1940’s. The problem was solved numerically by Dantzig in 1949 and others in the 1950’s within the framework of linear programming, paving the way for major industrial applications in the second half of the 20th century. OT was later rediscovered under a different light by analysts in the 90’s, following important work by Brenier and others, as well as in the computer vision/graphics fields under the name of earth mover’s distances. Recent years have witnessed yet another revolution in the spread of OT, thanks to the emergence of approximate solvers that can scale to sizes and dimensions that are relevant to data sciences. Thanks to this newfound scalability, OT is being increasingly used to unlock various problems in imaging sciences (such as color or texture processing), computer vision and graphics (for shape manipulation) or machine learning (for regression, classification and density fitting). This paper reviews OT with a bias toward numerical methods and their applications in data sciences, and sheds lights on the theoretical properties of OT that make it particularly useful for some of these applications. Our focus is on the recent wave of efficient algorithms that have helped translate attractive theoretical properties onto elegant and scalable tools for a wide variety of applications. We also give a prominent place to the many generalizations of OT that have been proposed in but a few years, and connect them with related approaches originating from statistical inference, kernel methods and information theory. A companion website 1 provides bibliographical and numerical resources, and in particular gives access to all the open source software needed to reproduce the figures of this article.

Suggested Citation

  • Gabriel Peyré & Marco Cuturi, 2017. "Computational Optimal Transport," Working Papers 2017-86, Center for Research in Economics and Statistics.
  • Handle: RePEc:crs:wpaper:2017-86
    as

    Download full text from publisher

    File URL: http://crest.science/RePEc/wpstorage/2017-86.pdf
    File Function: CREST working paper version
    Download Restriction: no
    ---><---

    Citations

    Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.
    as


    Cited by:

    1. Hadrien De March & Pierre Henry-Labordere, 2019. "Building arbitrage-free implied volatility: Sinkhorn's algorithm and variants," Papers 1902.04456, arXiv.org, revised Jul 2023.
    2. Cyril B'en'ezet & Jean-Franc{c}ois Chassagneux & Mohan Yang, 2023. "An optimal transport approach for the multiple quantile hedging problem," Papers 2308.01121, arXiv.org.
    3. Hadrien de March & Pierre Henry-Labordere, 2019. "Building Arbitrage-Free Implied Volatility: Sinkhorn'S Algorithm And Variants," Working Papers hal-02011533, HAL.

    More about this item

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:crs:wpaper:2017-86. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help adding them by using this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Secretariat General (email available below). General contact details of provider: https://edirc.repec.org/data/crestfr.html .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.