Web Scraping with Object-Oriented Programming
Learning object-oriented programming (OOP) allows developers to effectively model the real world. Having learned SQLite, Ruby/Rack, and ActiveRecord, I now have the ability to create my own API that stores data that interests me. Together, these tools make creating a backend relatively simple.
Creating my own backend API strikes me as a big step in my coding. I am no longer restricted to APIs made available at someone else's discretion; I can work on projects and track data that interests me. As nice as it is to create my own APIs, it is still a rather monotonous task to keep adding and updating data through migration files or a frontend feature. This is where the power of scraping comes into play.
Scraping is a process that combs over websites and extracts data that you choose. Essentially, scraping sets up bots to collect the data you want to analyze and stores it for that purpose. Scraping can be done in Ruby.
Scraping works by sending HTTP requests to a server, parsing the returned code, and storing the extracted data, all of which can be handled with SQLite, Ruby, and ActiveRecord. A few gems need to be installed for scraping to work. Dynamic pages and single-page applications are more complicated to scrape, but it can be done with different tools. There are also several pitfalls, because scraping traffic doesn't come from a normal user. Still, with the proper resources and precautions, scraping can be used effectively to many ends.
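The parse-and-store half of that cycle can be sketched in plain Ruby. This is a minimal illustration using only the standard library and a made-up HTML snippet standing in for a real response body; real scrapers should use a proper HTML parser such as the Nokogiri gem rather than a regex.

```ruby
require "json"

# A sample page standing in for the body of a real HTTP response.
sample_html = <<~HTML
  <ul>
    <li class="post"><a href="/posts/1">First Post</a></li>
    <li class="post"><a href="/posts/2">Second Post</a></li>
  </ul>
HTML

# Pull out each link's href and title with a simple regex.
# (A real project would use Nokogiri's CSS selectors instead.)
posts = sample_html.scan(%r{<a href="([^"]+)">([^<]+)</a>}).map do |href, title|
  { "href" => href, "title" => title }
end

puts JSON.pretty_generate(posts)
```

Each extracted record ends up as a hash, which maps naturally onto an ActiveRecord model or a JSON file later on.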
Scraping has many applications, ranging from social media analysis, e-commerce, and investing to machine learning and many other industries. To set up a web scraper, you must first decide what you are looking for. What is your end goal? What will you do with the scraped data? How will the data be used and presented? Who will use your scraping tool? How will you store the data? How often will you scrape for new data?
To set up a web scraping tool that works for you, you must ask and answer these questions. Once they are answered, you can move on to the finer technical details of performing a web scrape for an application. For my uses, web scraping works well with SQLite and Ruby. I would, for example, use Ruby both for scraping and on the backend in conjunction with ActiveRecord. I would then choose a target website to scrape. Once I have the URL, I can make requests to get the target HTML of the page.
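Making that request needs nothing beyond Ruby's standard Net::HTTP library. The sketch below builds a GET request for a hypothetical URL (example.com stands in for whatever site you target); the actual send is commented out so the snippet runs without network access.

```ruby
require "net/http"
require "uri"

# Hypothetical target page; swap in the site you actually want to scrape.
url = URI.parse("https://example.com/articles")

# Build the GET request. A User-Agent header identifies the scraper
# politely; many sites reject requests that omit one.
request = Net::HTTP::Get.new(url)
request["User-Agent"] = "MyScraperBot/1.0"

# Actually sending it would look like this (commented out so the
# sketch runs offline):
# response = Net::HTTP.start(url.host, url.port, use_ssl: true) do |http|
#   http.request(request)
# end
# html = response.body

puts request.path
```

The `response.body` string is the raw HTML that the parsing step then combs through.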
Parsing the HTML will be one of the most difficult steps, since a site's HTML may change or update at any time; this can crash your entire application and force adjustments. There is also the issue of how often you make requests to a specific server. When setting up a web scraping application, it is important to test that you are returning the correct data. Once you are sure of that, you can save or store the data in a standard, structured format such as JSON. Those are the primary steps in setting up a web scraping service; however, half the difficulty is in maintaining it and solving challenges during development.
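The save-to-JSON step is a few lines with Ruby's built-in JSON module. The records and the file name here are hypothetical; reading the file back is a quick sanity check that the data survived the round trip.

```ruby
require "json"

# Hypothetical records produced by an earlier parsing step.
records = [
  { "title" => "First Post",  "url" => "/posts/1" },
  { "title" => "Second Post", "url" => "/posts/2" }
]

# Persist in a standard, structured format so the backend (or an
# ActiveRecord seed task) can load it later.
File.write("scraped_posts.json", JSON.pretty_generate(records))

# Reading it back confirms the data round-trips intact.
loaded = JSON.parse(File.read("scraped_posts.json"))
puts loaded.length
```

From here the JSON can be imported into SQLite through ActiveRecord, or served directly by the API.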
Scraping is a great option for its ability to save users time and cost, increase speed, reduce error, and add convenience. However, there are considerations when deciding whether or not to use a web scraper. It is not always right to scrape a website: sites have property rights that need to be respected. As long as the data is freely available to third parties, you generally have the green light to scrape it. Scraping is also not only used for beneficial applications; spam, for example, is often a result of web scraping, where scrapers accumulate email addresses and contact information.
Web scraping is a powerful tool. You can glean information from social media, find the lowest prices on e-commerce sites, or even pull information from article databases. I am excited to start collecting data for use in my projects. I feel that learning OOP along with scraping has grown my potential as a programmer, and I now feel able to take on more interesting problems and create more meaningful projects. I am hoping to create a web scraping application that aggregates findings from medical publications. The opportunity web scraping provides is immense, and I look forward to building out some projects for my portfolio.