Learning By Doing
Several months ago I made two resolutions. One was to learn how to program; the other was to write about it more on this blog. I’ve partly kept one of those. The other, not so much.
Part of the reason is that I didn’t feel I had quite as much to say as I expected I would. I was reading some books and following along in some online communities, and that didn’t really sound interesting to write about.
Now, though, I have a project. A personal itch that needs scratching. It’s small, and it’s pretty banal in the scheme of things, but it is kind of gnawing at my brain.
For my job, I find myself sorting large numbers of PDFs pretty often. The details aren’t important, but suffice it to say that this is something I’m sure a computer can do much more quickly than I can. So I set about figuring out how to get this done.
My first inclination was to come up with a Python script. I’m sure that it is possible in Python, but if I wanted to get into the raw data inside the PDF, I thought I’d need something more powerful. My next thought was to write something in C. This seemed like it would be way more effort than it was worth, since my knowledge of C is… well, let’s just say I read a book on it once, and still own that book, so at least I have that.
My next thought was Swift, but again, there’s a serious learning curve there. Finally, I decided on Objective-C. Xcode offers an option for “Command Line Utilities”, so why not actually create one?
This is exciting because it’s my first time diving into an Objective-C project without a net. I honestly have no way of knowing how far above my head this project is until I hit a wall, and at that point there’s no flipping to the back of the book.
I started my project on Wednesday, after preparing for Thanksgiving. After working for a few hours here and there over the weekend, I currently have two classes totaling a couple hundred lines of code. I roughly sketched out the plan. The first goal was to find all of the PDFs in a directory. I got that part down, though it took me a while to figure out when to use an NSArray and when to use an NSMutableArray.
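For anyone curious what “find all of the PDFs in a directory” can look like, here is a minimal sketch. It is not the code from my project: the helper name PDFURLsInDirectory and the idea of passing the directory as the first command-line argument are made up for illustration. It also shows the array distinction in practice: the list that comes back from NSFileManager is an immutable NSArray, while the list being built up needs to be an NSMutableArray.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original project): returns the URLs of
// the items directly inside dirURL whose extension is "pdf".
// Returns nil and fills *error if the directory cannot be read.
static NSArray<NSURL *> *PDFURLsInDirectory(NSURL *dirURL, NSError **error) {
    NSFileManager *fm = [NSFileManager defaultManager];

    // Immutable NSArray: we only read from this list.
    NSArray<NSURL *> *contents =
        [fm contentsOfDirectoryAtURL:dirURL
          includingPropertiesForKeys:nil
                             options:NSDirectoryEnumerationSkipsHiddenFiles
                               error:error];
    if (contents == nil) {
        return nil;
    }

    // NSMutableArray: we add to this list as matches turn up.
    NSMutableArray<NSURL *> *pdfs = [NSMutableArray array];
    for (NSURL *url in contents) {
        // Compare extensions case-insensitively so "REPORT.PDF" also matches.
        // A nil extension simply fails the comparison.
        if ([[url.pathExtension lowercaseString] isEqualToString:@"pdf"]) {
            [pdfs addObject:url];
        }
    }

    // Hand back an immutable copy so callers cannot change it by accident.
    return [pdfs copy];
}

int main(int argc, const char *argv[]) {
    @autoreleasepool {
        // Assumption for this sketch: the directory is the first argument,
        // falling back to the current directory.
        NSString *dirPath = (argc > 1)
            ? [NSString stringWithUTF8String:argv[1]]
            : @".";
        NSURL *dirURL = [NSURL fileURLWithPath:dirPath isDirectory:YES];

        NSError *error = nil;
        NSArray<NSURL *> *pdfs = PDFURLsInDirectory(dirURL, &error);
        if (pdfs == nil) {
            NSLog(@"Could not read %@: %@", dirPath, error.localizedDescription);
            return 1;
        }
        for (NSURL *url in pdfs) {
            NSLog(@"%@", url.path);
        }
    }
    return 0;
}
```

This only looks at the top level of the directory and trusts the file extension, so a folder that happens to be named something.pdf would slip through. That is good enough for a first pass.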
Beyond that rather basic syntax, my biggest confusion so far has been navigating the file system from inside an app. You have NSFileManager available to you, but some methods take NSURLs, and some take paths. How exactly do they work? Can I convert between the two? Are they really giving me what I want? How do I check?
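Here is a small sketch of the kind of experiment that helps answer those questions. The path /tmp/example.pdf is a made-up placeholder, and this is an illustration rather than code from my project. The short version: yes, you can convert in both directions, and logging the values is the quickest way to check what you actually have.

```objc
#import <Foundation/Foundation.h>

int main(int argc, const char *argv[]) {
    @autoreleasepool {
        // Hypothetical path, used only for illustration.
        NSString *path = @"/tmp/example.pdf";

        // Path -> URL. For files on disk, use fileURLWithPath: rather than
        // URLWithString:, which expects an already-formed URL string.
        NSURL *url = [NSURL fileURLWithPath:path];

        // URL -> path.
        NSString *backToPath = url.path;

        // Log both forms to see what each one really contains.
        NSLog(@"As URL:  %@", url);
        NSLog(@"As path: %@", backToPath);
        NSLog(@"Is a file URL: %@", [url isFileURL] ? @"YES" : @"NO");

        // Ask NSFileManager whether the thing exists and what kind it is.
        BOOL isDirectory = NO;
        BOOL exists = [[NSFileManager defaultManager]
            fileExistsAtPath:backToPath
                 isDirectory:&isDirectory];
        NSLog(@"Exists: %@, directory: %@",
              exists ? @"YES" : @"NO",
              isDirectory ? @"YES" : @"NO");
    }
    return 0;
}
```

Running something like this against a file you know is there, and then against one you know is not, makes it much easier to trust what the path-based and URL-based methods are handing back.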
I think I have that more or less sorted out. But now comes the hard part: digging into a PDF file and seeing what I can pull out. I know the raw bytes I’m looking for are there. My worry now is separating the signal from the noise.