Date: Mon, 19 Aug 2019 20:12:43 +0000
<p>Video annotation is an expensive and time-consuming process. As a consequence, the available video datasets are useful but small. The availability of machine-transcribed explainer videos offers a unique opportunity to rapidly develop a useful, if noisy, corpus of videos that are "self-annotating", as hosts explain the actions they are taking on screen.</p>
<p>This episode is a discussion of the <a href="https://www.di.ens.fr/willow/research/howto100m/">HowTo100M</a> dataset - a project that has assembled a video corpus of 136M video clips with captions covering 23k activities.</p>
<h3>Related Links</h3>
<p>The paper will be presented at <a href="http://iccv2019.thecvf.com/">ICCV 2019</a></p>
<p><a href="https://twitter.com/antoine77340">@antoine77340</a></p>
<p><a href="https://github.com/antoine77340">Antoine on Github</a></p>
<p><a href="https://www.di.ens.fr/~miech/">Antoine's homepage</a></p>
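<p>The "self-annotating" idea can be sketched in a few lines: automatic speech recognition already yields narration segments with timestamps, and each segment's time span can be paired with the corresponding video clip as a weak caption. The function and sample data below are hypothetical illustrations, not code from the HowTo100M project.</p>

```python
# Sketch (assumed, not the authors' pipeline): turning timestamped ASR
# narration into weak clip-caption pairs, the core idea behind a
# "self-annotating" instructional-video corpus.

def transcript_to_clips(transcript):
    """transcript: list of (start_sec, end_sec, text) ASR segments.

    Returns one annotation per segment, pairing the clip's time span
    with the narration spoken during it."""
    return [
        {"start": start, "end": end, "caption": text.strip()}
        for start, end, text in transcript
    ]

# Toy example: hypothetical narration from a cooking explainer video.
asr = [
    (0.0, 3.2, "first we peel the garlic "),
    (3.2, 7.5, "now chop it finely"),
]

clips = transcript_to_clips(asr)
print(clips)
```

<p>The resulting captions are noisy - narration can lag or anticipate the on-screen action - which is exactly the "useful, if noisy" trade-off the episode discusses.</p>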