PolBeRG seminar - Web Scraping

March 14, 2014 - 17:30 - 19:00
Nador u. 9, Faculty Tower
Event audience: CEU Community Only
Juraj Medzihorsky
CEU contact person: Martin Mölder

This Friday we will have a different kind of PolBeRG meeting: a mixture of a workshop and a seminar. A lot of data that we as social or political scientists could potentially use is online, but most often in a form that is not very analyst-friendly, scattered across web pages and domains. Manually going from page to page and copy-pasting each piece of information into a format that you can work with usually takes a very long time. All in all, there might be more data online than you could ever handle manually. Luckily, with a little bit of easy programming (in R, C#, PHP or some other language), much of this process can be automated, and what would previously have taken forever can be done in a few minutes or hours.

Juraj and Carl have both done quite a bit of web scraping in the course of their work, and this Friday they will come to PolBeRG to show us what they have done and how. It will be an introduction to what is possible, so that if you are not yet familiar with web scraping, you will know what can be done and where to start. If you have an idea of something that you would like to scrape from the web, bring it along (and if you do not, think of one). At the end of the workshop, we will have time to take a very brief look at how some of your ideas for scraping could be implemented.
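
To give a rough idea of what "automating the copy-pasting" can look like, here is a minimal sketch in Python. It is only an illustration, not necessarily the approach or the language the presenters will use. The HTML snippet, the party names and numbers, and the names `SAMPLE_HTML`, `TableScraper` and `scrape_table` are all made up for this example; in real use the page would be downloaded rather than typed in.

```python
from html.parser import HTMLParser

# Hypothetical page content, invented for illustration. In practice this
# string would be downloaded from a website instead of written by hand.
SAMPLE_HTML = """
<table>
  <tr><td>Party A</td><td>41.2</td></tr>
  <tr><td>Party B</td><td>35.7</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # rows that have been fully read
        self._row = None        # cells of the row currently being read
        self._in_cell = False   # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def scrape_table(html):
    """Return the table in `html` as a list of rows (lists of strings)."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


rows = scrape_table(SAMPLE_HTML)
for row in rows:
    print(row)
```

The same function could be applied in a loop to many downloaded pages, which is where the time savings over manual copy-pasting come from.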