Release 0.3 End (OSD600)

I have found an external open source project that I want to work on more seriously for my OSD600 class. Well, I didn’t just find it. I mentioned before that I was interested in working on MTGATracker, which allows players of the online card game Magic: The Gathering Arena to keep track of what is in their deck (and many other useful things). When I initially looked at it, I thought it might be interesting, but I was later scared away. It looked kind of complicated and didn’t seem to have any low-hanging fruit for me to grab. Not only have I not used Python in a while, I have never used it in conjunction with other technologies. I don’t know much about UI building either. But, after coming back to it several times during my search, I have decided to dive into the code and figure out how it works.

But there’s no time to familiarize myself with it yet, so I decided to do something quicker. While mulling things over, I ended up looking at one of the repositories I previously contributed documentation to, and considered doing some more of the same. I noticed, however, that none of the links in the README file actually worked. I expected this to be a 30-second fix I could tack onto the documentation I would later write, so I jumped in to make the quick change.

Except it wasn’t that simple. I didn’t understand how the link references worked. Even after looking it up, I still didn’t quite get it. How did it know where to send the user? Eventually I found something unhelpful that nevertheless nudged a gear in my brain and gave me an a-ha moment. There was a conversion algorithm linking things together: the link targets had to use heading markdown, and they had to be written in a specific way. Rushing to put my newly formed theory into practice, I ended up breaking the entire bullet-point hierarchy too. I had to reconcile two structures that didn’t work together, and apply the new structure throughout the entire documentation. Was it the most difficult thing in the world? Perhaps not, but I feel I learned a little more than if I had just repeated the same documentation gig as last time.
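The conversion works roughly like this: GitHub turns each heading into an anchor by lowercasing it, dropping punctuation, and replacing spaces with hyphens, and the link target must match that anchor exactly. Here is a small Python sketch of (an approximation of) that slug algorithm; GitHub’s real implementation has more edge cases.

```python
import re

def github_anchor(heading):
    """Approximate GitHub's heading-to-anchor conversion:
    lowercase, drop punctuation, replace runs of spaces with hyphens."""
    text = heading.strip().lower()
    text = re.sub(r"[^\w\s-]", "", text)  # drop punctuation
    return re.sub(r"\s+", "-", text)      # spaces -> hyphens

# A link like [Getting Started](#getting-started) resolves to the
# heading "## Getting Started" because both reduce to the same anchor.
print(github_anchor("Getting Started"))     # -> getting-started
print(github_anchor("FAQ: Common Issues"))  # -> faq-common-issues
```

So a README link only works if the heading it points at is actual markdown heading syntax, written so that both sides reduce to the same slug.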

 

On the internal project front, I’m still working on DarkChatter. I’m finally kitted out with the essentials for the project (a Fedora virtual machine, since Windows comes with extra roadblocks; a USB network card to hook into my VM; and the myriad software packages involved with advanced network wizardry).

My next task was to integrate the command-line argument research I had done last time into the actual project. As I started to edit the main Python script that Ryan had been fiddling with, I realized two things. First, it was a “test” file of sorts: ScapyTest.py need not resemble anything that finally makes it into the final code, so piggybacking onto it and inserting my bits might be a waste of time. The other thing I realized was that I didn’t have to put everything in one file. My memories of Python are a bit foggy, but back then I didn’t know the magic of writing maintainable code and separating unrelated content. Better to leave that world behind. I split the command-line argument logic into one file and put the main code in another. Then I built out the overarching entry logic that determines whether our script is sending or receiving packets. The code we eventually add for that can then be slid into place.
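The send-or-receive entry logic above can be sketched with Python’s argparse along these lines. The flag names and interface default here are hypothetical, not DarkChatter’s actual options:

```python
import argparse

def parse_args(argv=None):
    # Hypothetical flags; the real DarkChatter options may well differ.
    parser = argparse.ArgumentParser(description="DarkChatter prototype")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--send", action="store_true", help="broadcast messages")
    mode.add_argument("--receive", action="store_true", help="listen for messages")
    parser.add_argument("--iface", default="wlan0", help="wireless interface to use")
    return parser.parse_args(argv)

def main(argv=None):
    # Entry logic: route to the sending or receiving code once it exists.
    args = parse_args(argv)
    return "send" if args.send else "receive"
```

Keeping `parse_args` in its own module means the networking code can be slid in later without touching the argument handling.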


Release 0.3 Update 2 (OSD600)

This semester keeps going faster by the day. It seems like every course has its own project, and the internal projects for my open-source course seem simultaneously the most and least intimidating. Getting a project off the ground within the last six weeks of class seems impossible, and scary. The fact that there is no hard final product to produce alleviates the pressure a bit, but I would much rather take a slow and thorough approach. My opinion on this comes from two things.

First is the chaos of the project being pulled in various directions. I had originally hoped to play a bit of a manager role on DarkChatter, the student project begun by Ryan and me. While Ryan seemed engrossed in the technical side of things, I felt that someone needed to chart a course on what exactly was being built. I will admit that I was slow to do this. On top of juggling my other courses, I had to do some research to bring myself up to speed on the subject and to look ahead at where the project might go. Hesitant to start building something that might not even be feasible, I really took my time on this.

In the meantime, several people hopped onto the project and started pulling it in new directions. Suddenly we were a multi-platform chat app with 3 different repositories, and I felt a little overwhelmed. All the plans brewing in my mind suddenly seemed not to fit into the whole, and the project had out-scoped any managerial aspirations of mine. This isn’t necessarily for the worse, though. It does seem possible for the three parts of the project to co-exist. If someone else wants to work on an iOS messaging app that leverages the tech, more power to them. This lets me focus on the part I find interesting and unique: the back-end. But I worry about the likelihood of it all coming together by the end of the semester.

The second issue I ran into recently is getting everything running. It turns out the stuff we are doing with Wi-Fi cards and monitoring for dangling packets is a lot easier on Linux, meaning I had to get a Linux machine running for myself. I didn’t want to do this on the school’s matrix server initially, because I wasn’t sure if I could or should be doing weird networking nonsense on their machines. This meant falling back on my USB copy of Mint. Except it didn’t seem to work on my laptop. It worked on my desktop, but I couldn’t use it there for other reasons. The time spent hunting for a Linux machine just makes me lament the time crunch even more.

So, what did I end up getting done after all this research and Linux-hunting? Well, I’m still no networking wizard, so I started to do some work elsewhere in the project. I learned how to implement command-line arguments and started to set them up. While there is still the matter of hooking them up to something useful on the networking side, I took the opportunity to create a command-line argument guide/template to aid any contributors (and probably future me). Next goal: make the arguments actually worth using!

Software Optimization Project Plan (SPO600 Assignment Stage 1)

Over the next few weeks I will put what I’ve learned in Software Optimization and Portability to the test by attempting to speed up some open-source hashing software. The hashing tool in question is called cityhash, which comes from Google. It hashes strings, and is written in C++. I chose it because it seems to be legitimate and claims decent performance, but still doesn’t seem so over-streamlined that I am incapable of contributing anything. That may still be the case in the end, but I think the potential is there.

Benchmarking

I hope to build the project on two systems: my own computer (an x86_64 system) and, for variety, one of the school’s AArch64 servers. Using the test code included in the package as a guide, I will make a large set of strings to increase the runtime. Using the profiling tool perf, I will measure the overall and function-specific run-times of a specific hash function (probably CityHash32 or CityHash64) and take the average over several runs. The same data set will later be used to check for speedups.
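The averaging methodology can be sketched as follows. This is a Python stand-in only: the real benchmark will hash with CityHash32/64 in C++ under perf, and `hashlib.md5` here is just a placeholder workload to show the fixed-data-set, multiple-run structure.

```python
import hashlib
import random
import string
import time

# Fixed seed so the same data set is reused when checking for speedups later.
random.seed(1)
data = ["".join(random.choices(string.ascii_letters, k=64)).encode()
        for _ in range(50_000)]

def timed_run():
    """Hash the whole data set once and return the elapsed wall time."""
    start = time.perf_counter()
    for s in data:
        hashlib.md5(s).digest()  # placeholder for CityHash32/CityHash64
    return time.perf_counter() - start

runs = [timed_run() for _ in range(5)]
avg = sum(runs) / len(runs)
print(f"average over {len(runs)} runs: {avg:.4f}s")
```

The same idea applies on both machines: same seeded data, several runs, compare averages before and after the optimization.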

 

Optimization

There are a variety of options to explore when it comes to optimizing this function. The simplest seems to be adjusting compiler options, which I may play with, but the README suggests the authors are already aware of the more general compiler optimizations available (such as gcc preferring “-g -O3” over the default “-g -O2”). The actual functions themselves seem too complex for me to significantly rewrite. I also find it likely that using inline assembler would backfire on me, as I have read that except in very specific cases, it often leads to a slow-down.

Therefore, I have my eyes set on vectorizing the code; that is, grouping similar operations and doing them simultaneously rather than one by one. There are even notes in the code suggesting that no SIMD (Single Instruction, Multiple Data) operations are done, so I think it provides a good candidate for improvement.
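To make the idea concrete, here is an illustrative NumPy sketch of the same principle: one array operation replaces a scalar loop, letting the hardware apply the same instruction to many values at once. (cityhash itself is C++, not NumPy; the mixing constant below is just a common 64-bit example, not taken from cityhash.)

```python
import numpy as np

vals = np.arange(8, dtype=np.uint64)
MULT = np.uint64(0x9E3779B97F4A7C15)  # an example 64-bit mixing constant

def mix_scalar(xs):
    # One multiply per element, as an unvectorized loop would do.
    return [(int(x) * 0x9E3779B97F4A7C15) % 2**64 for x in xs]

def mix_vector(xs):
    # A single vectorized multiply over the whole array (wraps mod 2^64).
    return xs * MULT

print(mix_vector(vals).tolist() == mix_scalar(vals))  # -> True
```

In C++ the same transformation would mean arranging the hash's inner loop so the compiler (or explicit intrinsics) can issue SIMD instructions over several lanes of data at once.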

The process of doing so may be quite difficult. I am reading up on vectorization, and a conversation with my professor suggested that vectorizing this code may take me down a more complex path than I intended. I am, however, up for the challenge, and will carefully learn about masking and using vector registers. Fortunately, I have until stage 3 to bring myself up to speed. I’m kind of excited to be getting my hands dirty with something like this.
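Masking, as I currently understand it, lets a vectorized loop handle per-element conditions without branching: compute both paths across all lanes, then select per lane with a boolean mask. A small NumPy illustration of the concept (not cityhash code):

```python
import numpy as np

x = np.array([3, 10, 7, 22, 5])
mask = x > 6                           # one "compare" across all lanes
result = np.where(mask, x * 2, x + 1)  # per-lane select: double vs increment
print(result.tolist())  # -> [4, 20, 14, 44, 6]
```

Hardware SIMD does the analogous thing with mask registers selecting between two computed vectors.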

Since the project comes with a testing suite, I will run the tests to make sure nothing is broken. I will also do a cursory check to make sure the outputs are coming out the same. To prove performance increases, I will compare both overall runtime and relative runtime for the function in question, on both systems. I will focus mostly on speed, and not memory or other resource dimensions, though if I stumble onto something of interest I will include it.

Building a collaborative project from the ground up (OSD600 0.3 week 1)

A New Project

With the start of a new assignment cycle, the time has come to start an open source project. For the 0.3 release of my open source development course, my classmates began brainstorming ideas for internal projects that we could envision building for the remainder of the course. A friend of mine had several ideas, but one appealed to me the most: an application that uses a device’s network card to send data directly to another network card, cutting out the middle-man (no Internet, or even LAN). After telling him how much I liked the idea, we pitched it to the class.

Thus, DarkChatter was born. In my eyes, the details are still a bit up in the air, however. Would this be a full-blown chat application, or just a message-in-a-bottle deal? Would it be fully anonymous, or somewhat anonymous with persistent identities? Would it be a mobile application, or would permission issues or required rooting make PC the preferred platform? (Edit: it seems this may be easier on an iPhone, but I’m an Android user.)

I’m still not sure where it will go. But in all of this I know where my preference lies: I want to be closer to the lower levels on this. What is going on with the hardware and the packets, and what difficulties are there in doing this? Why is this not already a more popular thing? Whether it is a chat app or not is not the biggest factor for me. However, first we need to get this off the ground and get people on board.

For that reason I consider this a research week. Some of the choices that need to be made must be made by someone informed on the topic. As I am not an expert at networking or app development, I want to prime myself on the relevant topics.

Research Begins

The project’s original mastermind pointed me to Scapy first. Scapy is a tool that allows you to craft packets in any way you want, even ways that don’t seem to make sense. This seems useful, since we are likely to be doing some non-standard networking wizardry, and it will allow us to customize things to our needs. It is based in Python, though, which makes me wonder how to get it running in the context of a larger codebase in another language.

After reading a bit about ad-hoc networks and going down the rabbit hole into mesh networking, I realized that there could be many nodes communicating at once. However, I am probably still against group chatting in the interest of time and reliability. It also got me wondering about distance limits and how chaining would work. Could this be extremely useful in areas of high population density? Or would other people slow down my device by using me as a middleman to pass their data? But, back on topic…

After looking up a list of similar applications, I found that Bridgefy offers an SDK for communicating without Internet for up to 100 ft. It doesn’t, however, say that it is free, so I am afraid it is likely not. The silver lining is that if they can deliver these services without jail-breaking anything, perhaps we can too. But the lack of existing open source projects in this realm is going to hurt, and will require me to do much more research with less hand-holding. Hopefully I won’t be in over my head.

Hacktoberfest Retrospective (OSD600)

Over the past month, I have taken part in an event called Hacktoberfest, taking some first steps into the collaborative world of open source programming. The goal was to make 5 pull requests on github.com over the month of October, earning a T-shirt and credit toward my Open Source Development course. A list of Pull Requests and their associated Issues is provided below.

Issue 1        PR 1           Blog Post 1

Issue 2        PR 2           Blog Post 2

Issue 3        PR 3           Blog Post 3

Issue 4        PR 4           Blog Post 4

 

Unfortunately, I did not finish all 5 Pull Requests. Nevertheless, I think the entire process was still quite positive. There were a lot of things I learned along the way:

 

1. Open Source isn’t scary. (You can contribute too!)

It can be pretty intimidating to get started in the open-source world. For someone who doesn’t have much experience building anything real or useful, it can feel like there is always more you need to learn before you get to work on such projects. But it turns out you can always find projects that need help; whether your contributions are large or small, they will be appreciated. If you make a mistake, things can be reversed, and people will help you understand what went wrong. Furthermore, people will hold off on solving certain problems so that a beginner might have the chance to use them as a learning opportunity. This makes everything more welcoming, and gives programmers of all skill levels the freedom to find their own way to contribute.

 

2. Get your life in order

It goes without saying that if I couldn’t finish all the required PRs, then something went wrong. It wasn’t a time management problem in the sense that I forgot the work or overlooked it, but in the end it is the results that matter. If I were more strict with my time for even one day, I think I could have avoided this issue. Even if you are feeling tired and unproductive, working at 30% capacity can be better than not working at all.

 

3. Don’t expect things to be as quick or as simple as they seem

Whether it is due to a typo or a deep misunderstanding of how the code functions, there will be times when you think a fix requires 30 seconds of work or less. Complications can appear at every turn. The code rewrite might be just one line, but installing the package in order to run the build locally could take forever (if you get stuck behind a wall of errors). Making a small change might break things it shouldn’t, and some things might be intentionally written a certain way for a reason. You might make what seems like an obvious upgrade only to find out that it needs to be adjusted. You might end up making 8 commits instead of 1, even though you thought it was so easy. Which leads me to…

 

4. The PR isn’t the end of the story

Although the goal of this event was to make Pull Requests, you may need to make adjustments before one is merged into the project. Although I knew this going in, it was surprising how even a minor change could lead to a long back-and-forth or a significant rewrite of code. This turned into a kind of debt going forward, as I needed to continue finding new issues while earlier Pull Requests still lay waiting to be fixed. There are still some that I intend to go back to, but it may take some time.

 

Having learned from this experience, there are a few things I would change if I had to do this all over again.

The first is finding an easier way to locate the types of issues I wanted. I found it quite difficult to search when half of the issues in the results were on repositories for “easy hacktoberfest PRs”, or for code challenges. Although code challenges are not without value, a lot of repositories using the hacktoberfest label didn’t seem to be in the spirit of an open-source software celebration. Many repos seemed created to break the event’s rules, too. My only solution would be to craft better searches, which I began to do towards the end of the month.

The other thing I would do is maximize the amount of time I could dedicate to it. Although I didn’t need an incredible amount of extra time, I think that with fewer courses and better management of sleep and energy levels, I would have had a more positive experience. I have many interesting courses this semester, all of which have an incredible amount of depth to them. Beyond just completing my work, I would have loved to play around with things in several of them, including my work in open source. With deeper involvement in these projects, and less changing of gears as I switched my focus back and forth, I could have grown more.

 

Conclusion

I really enjoyed taking part in Hacktoberfest, and I will likely continue to find projects to work on in my spare time. I do, however, expect to be very picky when choosing my projects. Working on various types of projects and getting more experience with git make me feel like more of a programmer, and this first push has made it easier to continue later on.

Hacktoberfest Part 4 (OSD600)

Hacktoberfest continues! Finding issues to fix on github.com projects has been surprisingly challenging. The amount of startup time for getting involved with a project really depends on a lot of things. Large projects, whether in number of files or number of people, or even small ones in a language you don’t understand, take some research before you can get involved. That isn’t to say it is impossible, but I have found it very easy to dismiss projects on these grounds unless something bigger attracted me to a project and made it seem worth the burden.

This time my issue was writing documentation for Android Jetpack, explaining how to use EmojiCompat. EmojiCompat allows Android apps to use emojis, including ones that would normally not render because the font package included with the operating system version does not include newer emojis. A video was provided with a rundown on how it is used, and I based the documentation on that.

While not technically challenging, I did find it difficult to describe what was happening in the mandated commenting of code examples. Perhaps I stuck too much to writing code and not enough prose.

Hacktoberfest Part 3 (OSD600)

Continuing with finding issues for Hacktoberfest, I came across an issue on a repository for a project that allows users to manipulate images in a way that resembles the works of the artist Kensuke Koike. In creating the project, the author decided to limit the size of user-loaded images via cropping. Since most users are unlikely to want their images cropped, it was requested that this be replaced with image scaling.

I installed the project files and the necessary packages; however, the project didn’t build properly. So I ended up manually installing parcel-bundler globally on my computer, which allowed me to build the project and test it out. Finding a large enough photo, I confirmed that cropping was in fact an issue and that it hadn’t secretly been fixed already. Creating a new git branch, I got to work.

My expectation was that I would have to find a new library, or write up an entirely new method to get this done. However, it seemed that blueimp-load-image, the library already being used, had this functionality.

The original code looked something like this:

document.getElementById('file-input').onchange = function (e) {
  loadImage(
      e.target.files[0],
      imageSetup,
      {maxWidth: 2000} // Options
  );
};

After finding scaling methods in the library that loadImage was from, I found that it could be fixed with very few changes:

document.getElementById('file-input').onchange = function (e) {
  loadImage(
      e.target.files[0],
      imageSetup
  ).scale(2000);
};

Getting ready to post my pull request, I paused for a moment. Scaling down to a width of 2000 was the request, but is that what this was doing? If the image was smaller, would it be scaled up? That would probably be a waste of time, as the limit was imposed to prevent slowdown in the first place.

So I replaced it with this:

document.getElementById('file-input').onchange = function (e) {
  loadImage(
    e.target.files[0],
    imageSetup
  ).scale({maxWidth: 2000});
};

Checking again that everything worked properly, I tested with a few more images and was confident that my solution was satisfactory.