Module 2 - Attack of the APIs

I probably should have written this when I had actually finished the module, but better late than never, right? Anyway, like Episode II of Star Wars, Module 2 was my least favourite. However, unlike Episode II, it was also the module where I pushed myself the most, and I learned a lot because of it. I went through the first three exercises for this module.

The results of Exercises 2 and 3 can be found here. The first exercise results are in the research notebook.

The first exercise was simple enough. The Dream Case exercise had me searching a few databases for information - as discussed in my research notes, the search on the Epigraphic Database Heidelberg site came up with nothing (I tried searching 'coquit' and variants because I love cooking and it's one of the few words I remember from Grade 11 Latin class). The CWGC database provided me with some information about folks with the same last name as mine. It was interesting to see information about people who might be distant relatives! This exercise was just about searching terms, which is a handy skill to have but nothing new to me. Here is my notebook entry about this exercise.

Then things got interesting.

The Outwit Hub exercise was also mostly about following instructions, but I certainly learned a lot more from it. Outwit Hub was an entirely new program for me, and I had never scraped data before. I am not sure I did things ENTIRELY correctly, as my "Descriptions" column seemed to be offset by one row compared to my "Adler" column. I think this could definitely be a useful tool for obtaining specific information from databases. I also felt I had grasped the process well enough to be comfortable helping a fellow student out with Outwit Hub on the Slack space. Here's the notebook entry about this exercise.
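For anyone trying to recreate that kind of scrape without Outwit Hub, here is a minimal sketch of the same idea in Python. The URL and column positions are hypothetical placeholders, not the exercise's actual target; the point is that collecting both cells from the same table row keeps the columns from drifting out of alignment the way mine did.

```python
# A minimal scraping sketch (my own reconstruction, not Outwit Hub's output).
# The URL and column positions are hypothetical placeholders.
import csv

import requests
from bs4 import BeautifulSoup

url = "https://example.org/collection/table"  # hypothetical database page
soup = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")

rows = []
for tr in soup.select("table tr")[1:]:  # skip the header row
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if len(cells) >= 2:
        # Take both values from the SAME row, so an "Adler" number can never
        # end up one row away from its "Description".
        rows.append({"adler": cells[0], "description": cells[1]})

with open("scraped.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["adler", "description"])
    writer.writeheader()
    writer.writerows(rows)
```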

And then there was the API exercise. I will admit that I made incoherent yelling noises at my computer more than once. I found the instructions hard to follow regarding what needed to be downloaded and what did not. The first several times I tried to run the program that queried the API, files turned out to be missing or named incorrectly, and it took some figuring out each time to work out exactly what was wrong - which piece I was missing, or why the program couldn't find a file whose name didn't match the one it expected.

Eventually I got the whole thing running, and the output file was big enough to crash regular Notepad, which is the default program on my computer for .txt files. Notepad++ was strong enough to open it, though. When I ran the splitting program included with the zip file of all the required programs, I forgot to move things to a new folder first, and it plunked over 32 000 files into the same folder as all the other files needed for the exercise. I do not think my computer was pleased when I moved all of them into a separate folder.
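If I ever rerun that splitting step, I'll make the script create its own output folder first. Here's a minimal sketch of that fix (my own reconstruction, not the exercise's actual splitting program); the input filename and the blank-line record separator are assumptions.

```python
# A minimal splitting sketch (not the exercise's actual program): break a
# large results file into one file per record, inside a dedicated folder,
# so thousands of files don't land next to everything else.
# The input filename and blank-line record separator are assumptions.
from pathlib import Path

out_dir = Path("split_output")
out_dir.mkdir(exist_ok=True)  # the step I forgot: make a separate folder first

record, count = [], 0
with open("api_results.txt", encoding="utf-8") as f:
    for line in f:
        if line.strip():
            record.append(line)
        elif record:
            # End of a record: write it into the dedicated folder
            (out_dir / f"record_{count:05d}.txt").write_text("".join(record), encoding="utf-8")
            record, count = [], count + 1
if record:  # flush the final record if the file doesn't end with a blank line
    (out_dir / f"record_{count:05d}.txt").write_text("".join(record), encoding="utf-8")
```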

I really do understand the value of the third exercise. I think it is fantastic to have a way to pull data directly from the search engines and have that information in plain text to analyze and work with as I please. However, getting everything to work was probably the most frustrating part of this entire course. But now I know how to do it! Here's the notebook entry for this exercise, further elaborating on my frustration.
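For future reference, the heart of that whole pipeline really is only a few lines: query the API, then save the raw response as plain text. A minimal sketch, assuming a JSON-style search API with a simple query parameter (the endpoint and parameter names are hypothetical placeholders, not the course's actual API):

```python
# A minimal query-and-save sketch. The endpoint and parameters are
# hypothetical placeholders - the exercise's actual API differed.
import requests

response = requests.get(
    "https://api.example.org/search",
    params={"q": "your search term", "format": "json"},
    timeout=30,
)
response.raise_for_status()

# Save the raw response as plain text so it can be split and mined later.
with open("api_results.txt", "w", encoding="utf-8") as f:
    f.write(response.text)
```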

Written on April 3, 2016