Scraping Smarter with Content Filtering

Scrapy is a powerful web scraping framework and an essential tool for building machine learning datasets. For sites with a simple structure, Scrapy makes it easy to curate a dataset after launching a spider. Check out the tutorials in Scrapy’s documentation. To train a poster similarity model, we first gathered hundreds of thousands of movie posters. More concretely, when scraping IMDb.com, we may be interested in gathering posters from <img> tags under <div> tags with the class "poster"....

 · 3 min · Terry Rodriguez & Salma Mayorquin

Population Health Modeling

In a matter of months, the COVID-19 pandemic has besieged humanity, and the world now wrestles with the population health challenges of a novel coronavirus with remarkable infectivity. Organizing an effective response to blunt the impact of such a large, complex challenge demands a principled, scientific approach. Better Planning by Forecasting Infections: Reliable forecasting is crucial for planning and allocating limited resources efficiently, and for minimizing casualties....

 · 6 min · Terry Rodriguez & Salma Mayorquin