Release 0.3 Update 2 (OSD600)

This semester keeps going faster by the day. It seems like every course has its own project to do, and the internal projects for my open-source course seem to simultaneously be the most and the least intimidating. Getting a project off the ground within the last six weeks of class seems impossible, and scary. The fact that there is no hard final product to produce alleviates the pressure a bit, but I would much rather take a slow and thorough approach to it. My opinion on this comes from two things.

First is the chaos of the project being pulled in various directions. I had originally hoped to play a bit of a manager role on the student project DarkChatter, begun by Ryan and me. While Ryan seemed engrossed by the technical side of things, I felt that someone needed to chart a course on what exactly was being built. I will admit that I was slow in doing this. On top of juggling my other courses, I had to do some research to bring myself up to speed on the subject and to look ahead at where the project could go. Hesitant to start building something that might not even be feasible, I really took my time on this.

In the meantime, several people hopped onto the project and started pulling it in new directions. Suddenly we were a multi-platform chat app with 3 different repositories, and I felt a little overwhelmed. All the plans I had brewing in my mind suddenly seemed to not fit into the whole, and the project had outgrown any managerial aspirations of mine. This isn’t necessarily for the worse, though. It does seem possible for the three different parts of the project to co-exist. If someone else wants to work on an iOS messaging app that leverages the tech, more power to them. This in turn lets me focus more on the part I find interesting and unique: the back-end. But I worry about the likelihood of it all coming together by the end of the semester.

The second issue I ran into recently is getting everything running. It turns out the stuff we are doing with Wi-Fi cards and monitoring for dangling packets is a lot easier in Linux, meaning I had to try and get a Linux machine running for myself. I didn’t want to do this on the school’s matrix server initially, because I wasn’t sure if I could or should be doing weird networking nonsense off of their machines. This meant falling back onto my USB copy of Mint. Except it didn’t seem to work on my laptop. It worked on my desktop, but I couldn’t use it on there for other reasons. The time spent hunting for a Linux machine just makes me lament the time crunch even more.

So, what did I end up getting done after all this research and linux-hunting? Well, I’m still no networking wizard, so I started to do some work elsewhere in the project. I learned how to implement command line arguments and started to set them up. While there is still the matter of hooking it up to something useful on the networking side, I took the opportunity to create a command line argument guide/template to aid any contributors (and probably future-me). Next goal: make the arguments actually worth using!
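
A command line argument template of the kind described above might look roughly like the following. This is only a sketch, assuming Python and its standard argparse module (the post does not name the language or library). The program name and every flag, including the default interface name, are hypothetical placeholders rather than the project's real options.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build an example parser; every argument here is a hypothetical placeholder."""
    parser = argparse.ArgumentParser(
        prog="darkchatter",
        description="Template showing the common kinds of command line arguments.",
    )
    # Positional argument: required, identified by its position.
    parser.add_argument("message", help="text to send")
    # Optional argument that takes a value and has a default.
    parser.add_argument("-i", "--interface", default="wlan0",
                        help="network interface to use (assumed name)")
    # Boolean flag: True when present, False when absent.
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print extra diagnostic output")
    # Option converted to an integer by argparse.
    parser.add_argument("-c", "--count", type=int, default=1,
                        help="how many times to send the message")
    return parser


# Passing an explicit list makes the template easy to try without a shell.
args = build_parser().parse_args(["hello", "--verbose", "--count", "3"])
print(args.message, args.interface, args.verbose, args.count)
```

A contributor could copy one of the add_argument calls as a starting point and then read the parsed value off the returned namespace.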

Software Optimization Project Plan (SPO600 Assignment Stage 1)

Over the next few weeks I will put what I’ve learned in Software Optimization and Portability to the test by attempting to speed up some open-source hashing software. The hashing tool in question is called cityhash, which seems to come from Google. It hashes strings, and is written in C++. I chose it because it seems to be legitimate and claims decent performance, but still doesn’t seem so streamlined that I would be unable to contribute anything. That may still be the case in the end, but I think the potential is there.

Benchmarking

I hope to build the project on two systems: my own computer (an x86_64 system) and, for variety, one of the school’s aarch64 servers. Using the test code included in the package as a guide, I will make a large set of strings to increase the runtime. Using the profiling tool perf, I will measure the overall and function-specific run-times of a specific hash function (probably CityHash32 or CityHash64) and take the average over several runs. The same data set will later be used to check for speedups.
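
The idea of timing the same data set over several runs and averaging can be sketched in a few lines. The sketch below is in Python purely for illustration: it does not use perf, and stand_in_hash is a placeholder built on hashlib, not CityHash. The helper names, string counts, and lengths are all assumptions.

```python
import hashlib
import random
import statistics
import string
import time


def make_strings(count, length, seed=0):
    """Generate a reproducible data set so later runs hash the same input."""
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits
    return ["".join(rng.choices(alphabet, k=length)) for _ in range(count)]


def stand_in_hash(text):
    """Placeholder for the function under test; NOT the real CityHash64."""
    return hashlib.blake2b(text.encode("utf-8"), digest_size=8).digest()


def time_once(hash_fn, data):
    """Return the wall-clock seconds needed to hash every string once."""
    start = time.perf_counter()
    for item in data:
        hash_fn(item)
    return time.perf_counter() - start


def average_runtime(hash_fn, data, runs=5):
    """Average several timed runs to smooth out run-to-run noise."""
    return statistics.mean(time_once(hash_fn, data) for _ in range(runs))


data = make_strings(count=10_000, length=64)
print(f"mean over 5 runs: {average_runtime(stand_in_hash, data):.4f} s")
```

Fixing the random seed is what allows the same data set to be reused after the optimization, so that the before and after numbers are comparable.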

Optimization

There are a variety of options to explore when it comes to optimizing this function. The simplest way seems to be messing with compiler options, which I may play with, but the README suggests that they are already aware of the more general compiler optimizations available (such as gcc preferring “-g -O3” over the default “-g -O2”). The actual functions themselves seem too complex for me to significantly rewrite. I also find it likely that using in-line assembler will backfire on me, as I have read that except in very specific cases, it often leads to a slow-down.

Therefore, I have my eyes set on vectorizing the code. That is, to group similar operations and do them simultaneously rather than one by one. There are even notes in the code that suggest that no SIMD (Single Instruction Multiple Data) operations are done, so I think this provides a good candidate for improvement.

The process of doing so may be quite difficult. I am reading up on vectorization, and a conversation with my professor suggested that the process of vectorizing this code may take me on a more complex path than I intended. I am, however, up for the challenge, and will carefully learn about masking and using vector registers. Fortunately, I have until stage 3 to bring myself up to speed. I’m kind of excited to be getting my hands dirty with something like this.

Since the project comes with a testing suite, I will run the tests to make sure nothing is broken. I will also do a cursory check to make sure the outputs are coming out the same. To prove performance increases, I will compare both overall runtime and relative runtime for the function in question, on both systems. I will focus mostly on speed, and not memory or other resource dimensions, though if I stumble onto something of interest I will include it.

Building a collaborative project from the ground up (OSD600 0.3 week 1)

A New Project

With the start of a new assignment cycle, the time has come to start an open source project. For the 0.3 release of my open source development course, my classmates began brainstorming new ideas for internal projects that we could envision making for the remainder of the course. A friend of mine had several ideas, but one appealed the most to me: An application that used a device’s network card to send data directly to another network card, cutting out the middle-man (without internet, or even LAN). After telling him how much I liked the idea we pitched it to the class.

Thus DarkChatter was born. In my eyes, the details are still a bit up in the air, however. Would this be a full-blown chat application, or just a message-in-a-bottle deal? Would it be fully anonymous, or somewhat anonymous with persisting identities? Would it be a mobile application, or would the permission issues or required rooting make PC the preferred platform? (edit: it seems that this may be easier for an iPhone, but I’m an Android user).

I’m still not sure yet where it will go. But in all of this I know where my preference lies. I think I want to be closer to the lower levels on this. What is going on with the hardware, the packets, and what difficulties are there in doing this? Why is this not already a more popular thing? Whether it is a chat app or not is not the biggest factor for me. However, first we need to get this off the ground, and get people on board.

For that reason I consider this a research week. Some of the choices that need to be made must be done by someone who is informed on the topic. As I am not an expert at networking or app development, I want to prime myself on the relevant topics.

Research Begins

The project’s original mastermind pointed me to Scapy first. Scapy is a tool that allows you to craft packets in any way you want, even if it doesn’t seem to make sense. This does seem useful, since we are likely to be doing some non-standard networking wizardry, and this will allow us to customize things to our needs. It is based in Python, though, which makes me wonder how to get it running in the context of a larger codebase in another language.

After reading a bit about Ad-hoc Networks and going down the rabbit hole to read about Mesh Networking, I realized that there could be many nodes communicating at once. However, I am probably still against group chatting in the interest of time and reliability. It also got me wondering about the distance limits and how chaining would work. Could this be extremely useful in areas of high population density? Or would other people be slowing down my device by using me to pass data as a middleman? But, back on topic…

After looking up a list of similar applications, I found that Bridgefy offers an SDK for communicating without Internet for up to 100 ft. It doesn’t, however, say that it is free, so I am afraid that it is likely not. The silver lining is that if they are able to deliver these services without jail-breaking anything, perhaps we can too. But the lack of existing open source projects in this realm is going to hurt, and will require me to do much more research with less hand-holding. Hopefully I won’t be in over my head.

Hacktoberfest Retrospective (OSD600)

Over the past month, I have taken part in an event called Hacktoberfest, taking some first steps into the collaborative world of open source programming. The goal was to make 5 pull requests on github.com over the month of October, earning a T-shirt and credit toward my Open Source Development course. A list of Pull Requests and their associated Issues is provided below.

Issue 1        PR 1           Blog Post 1

Issue 2        PR 2           Blog Post 2

Issue 3        PR 3           Blog Post 3

Issue 4        PR 4           Blog Post 4

 

Unfortunately, I did not finish all 5 Pull Requests. Nevertheless, I think the entire process was still quite positive. There were a lot of things I learned along the way:

1. Open Source isn’t scary. (You can contribute too!)

It can be pretty intimidating to get started in the open-source world. For someone who doesn’t have much experience in building anything real or useful, it can feel like there is always more you need to learn before you get to work on such projects. But it turns out you can always find projects that need help, and whether your contributions are large or small, they will be appreciated. If you make a mistake, things can be reversed, and people will help you understand what went wrong. Furthermore, people will hold off on solving certain problems so that a beginner might have the chance to use them as a learning opportunity. This makes everything more welcoming, and gives programmers of all skill levels the freedom to find their own way to contribute.

2. Get your life in order

It goes without saying that if I couldn’t finish all the required PRs, then something went wrong. It wasn’t a time management problem in the sense that I forgot the work or overlooked it, but in the end it is the results that matter. If I had been stricter with my time for even one day, I think I could have avoided this issue. Even if you are feeling tired and unproductive, working at 30% capacity can be better than not working at all.

3. Don’t expect things to be as quick or as simple as they seem

Whether it is due to a typo or a deep misunderstanding of how the code functions, there will be times when something you think requires 30 seconds or less of work takes far longer. Complications can appear at every turn. The code rewrite might be just one line, but installing the package in order to run the build locally could take forever (if you get stuck behind a wall of errors). Making a small change might break things that it shouldn’t, and some things might be intentionally written a certain way for a certain reason. You might make what seems like an obvious upgrade only to find out that it needs to be adjusted. You might end up making 8 commits instead of 1, even though you thought it was so easy. Which leads me to…

4. The PR isn’t the end of the story

Although the goal for this event was to make Pull Requests, you may need to make adjustments before one is merged into the project. Although I knew this going in, it was surprising how even a minor change could lead to a long back-and-forth or a significant rewrite of code. This turned into a kind of debt going forward, as I needed to keep finding more issues while earlier Pull Requests still lay waiting to be fixed. There are still some that I intend to go back to, but it may take some time.

Having learned from this experience, there are a few things I would change if I had to do this all over again.

The first is finding an easier way to find the types of issues I wanted. I found it quite difficult to search when half of the issues in the search were on repositories for “easy hacktoberfest PRs”, or for code challenges. Although code challenges are not without value, a lot of repositories using the hacktoberfest label didn’t seem to be in the spirit of an open-source software celebration. Many repos seemed to be created to break the event’s rules, too. My only solution to this would be to craft better searches, which I began to do towards the end of the month.

The other thing I would do is maximize the amount of time I could dedicate to it. Although I didn’t need an incredible amount of extra time to work on it, I think if I had fewer courses and better management of sleep and energy levels, I would have had a more positive experience. I have found that I have many interesting courses this semester, all of which have an incredible amount of depth to them. Beyond just completing my work, I would have loved to play around with things in several of them, including my work in open source. With deeper involvement in these projects, and less changing of gears as I switch my focus back and forth, I could have grown more.

Conclusion

I really enjoyed taking part in Hacktoberfest, and I will likely continue to find projects to work on in my spare time. I do, however, expect to be very picky when choosing my projects. Working on various types of projects and getting more experience with git make me feel like more of a programmer, and this first push has made it easier to continue later on.

Hacktoberfest Part 4 (OSD600)

Hacktoberfest continues! Finding issues to fix on github.com projects has been surprisingly challenging. The amount of startup time needed to get involved with a project depends on a lot of things. Large projects, either in number of files or number of people, or even small ones in a language you don’t understand, take some research before you can get involved. That isn’t to say it is impossible, but I have found it very easy to dismiss projects on these grounds unless there was something bigger attracting me to a project and making it seem worth the burden.

This time my issue was writing documentation for Android JetPack, explaining how to use EmojiCompat. EmojiCompat allows android apps to use Emojis, including ones that would normally not render because the font package included with the operating system version does not include the newer Emojis that have been released. A video was provided with a rundown on how it is used, and I based the documentation on that.

While not technically challenging, I did find it difficult to describe what was happening in the mandated commenting of code examples. Perhaps I stuck too much to writing code and not enough prose.

Hacktoberfest Part 3 (OSD600)

Continuing with finding issues for Hacktoberfest, I came across an issue on a repository for a project that allows users to manipulate images in a way that resembles works of the artist Kensuke Koike. In creating the project, the author decided to limit the size of user-loaded images via cropping. Since most users are unlikely to want their images cropped, it was requested that this be replaced with image scaling.

I installed the project files and the necessary packages, but it didn’t build properly. So I ended up manually installing parcel-bundler globally on my computer, which allowed me to build the project and test it out. Finding a large enough photo, I confirmed that cropping was in fact an issue and that it hadn’t secretly been fixed already. Creating a new git branch, I got to work.

My expectation was that I would have to find a new library, or write up an entirely new method to get this done. However, it turned out that blueimp-load-image, the library already in use, had this functionality.

The original code looked something like this:

document.getElementById('file-input').onchange = function (e) {
  loadImage(
      e.target.files[0],
      imageSetup,
      {maxWidth: 2000} // Options
  );
};

After finding scaling methods in the library that loadImage was from, I found that it could be fixed with very few changes:

document.getElementById('file-input').onchange = function (e) {
  loadImage(
      e.target.files[0],
      imageSetup
  ).scale(2000);
};

Getting ready to post my pull request, I paused for a moment. Scaling it down to a width of 2000 was the request, but is that what this was doing? If the image was smaller, would it be scaled up? That is probably a waste of time, as the limit was imposed to prevent slowdown in the first place.

So I replaced it with this:

document.getElementById('file-input').onchange = function (e) {
  loadImage(
    e.target.files[0],
    imageSetup
  ).scale({maxWidth: 2000});
};

Checking again that everything worked properly, I tested with a few more images and was confident that my solution was satisfactory.
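
The distinction I worried about, always scaling to a target width versus only scaling down to a maximum width, comes down to a little arithmetic. Here is a minimal sketch of that arithmetic in Python. It is not the blueimp-load-image API, and both function names are made up for illustration.

```python
def forced_size(width, height, target_width=2000):
    """Always resize to target_width, which enlarges small images too."""
    ratio = target_width / width
    return target_width, round(height * ratio)


def capped_size(width, height, max_width=2000):
    """Shrink to max_width only when the image is wider than the limit."""
    if width <= max_width:
        return width, height  # already small enough, so leave it untouched
    ratio = max_width / width
    return max_width, round(height * ratio)


# A large photo is shrunk by both approaches, keeping its aspect ratio.
print(forced_size(4000, 3000), capped_size(4000, 3000))
# A small image is enlarged by the first approach but left alone by the second.
print(forced_size(800, 600), capped_size(800, 600))
```

Since the size limit existed to prevent slowdown, the capped version is the behavior I was after: small images never get larger.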

Hacktoberfest Part 2 (OSD600)

Continuing my Hacktoberfest journey, I kept searching for new projects to contribute to. I did find a handful of promising prospects, such as mtgatracker, but I decided to hold off on starting on it until I could take a better look at how it works.

In the meantime, I found a small project that aimed to remake various board games, starting with Tic Tac Toe. The issue I found was still relatively simple, but not to the point of being trivial. The task was to find places in the code where hard-coded symbols and magic numbers could be removed, thus making the code easier to maintain (and in my opinion, more readable). I wasn’t sure how hard I should be trying to do this, though. Were there cases where it is better to leave things as-is? Is there such a thing as going overboard?

I began with the replacement suggested by the person who posted the issue. The character ‘.’ was used to represent empty squares that don’t yet hold an X or an O. I began by pulling it out and defining it at the top of the file. I realized then that it was used in other files as well, so I had to move my defined constant elsewhere in the header file hierarchy. I tried to come up with a good name as well, so that reading the code would flow well. That way it went from something like this:

c != '.'

to

c != EMPTY_SQUARE

I think it’s easier to figure out what the code here is trying to do this way, though it may be more difficult to see exactly how it is done (by comparing a character). With that done, I ended up replacing a few things with existing constants.

After that, I replaced a few more symbols, added a use of an existing constant, and changed the game states from -1, 0, 1 to human-readable ones.
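
The same idea can be sketched in Python (the project itself uses C-style headers, so this is an illustration of the technique rather than the code I submitted). EMPTY_SQUARE is the name from above. The GameState member names and the meaning attached to each of -1, 0 and 1 are hypothetical.

```python
from enum import IntEnum

# Named constant replacing the hard-coded '.' character.
EMPTY_SQUARE = "."


class GameState(IntEnum):
    """Readable names for the raw values; this mapping is an assumption."""
    IN_PROGRESS = -1
    DRAW = 0
    WIN = 1


def is_board_full(board):
    """Return True when no square on the board is still empty."""
    return all(square != EMPTY_SQUARE for row in board for square in row)


def describe(state):
    """Turn a raw state value into a readable label."""
    return GameState(state).name.replace("_", " ").lower()


board = ["XOX", "O.O", "XOX"]
print(is_board_full(board), describe(-1))
```

Because IntEnum members still compare equal to their integer values, older code that checks against the raw numbers keeps working while it is migrated.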

There was also a part where I thought I had found a place to speed up the code (involving the ordering of conditions in an if statement). After some time, I realized that there would be no savings after all. All conditions were equally likely, so rearranging them would not let the check exit any earlier.

Both before and after working on this issue, I tried the Tic Tac Toe game for myself. At first I thought it was broken, since it seg faulted whenever I put in my commands (separated by commas). It wasn’t until I realized that they needed to be separated by a newline instead of a comma that I got it to work properly. This could be either a failure in communicating to the user the expectation (shown as (row, column) in the instructions) or a failure in accepting a larger variety of user inputs. Either way, I may end up raising the issue myself, but only after confirming that it is one.
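
If the fix turns out to be accepting a wider variety of inputs, the parsing could look something like this. It is a Python sketch of the idea only, not the project's code: parse_move is a made-up name, and the zero-based coordinates and 3x3 board size are assumptions.

```python
import re


def parse_move(text, board_size=3):
    """Parse a move given as 'row,col', 'row col', or one number per line.

    Returns a (row, col) tuple, or None when the input is not a valid move.
    Coordinates are assumed to be zero-based.
    """
    # Split on any run of commas and whitespace (spaces, tabs, newlines).
    parts = [p for p in re.split(r"[,\s]+", text.strip()) if p]
    if len(parts) != 2 or not all(p.isdecimal() for p in parts):
        return None  # reject instead of crashing on malformed input
    row, col = int(parts[0]), int(parts[1])
    if not (0 <= row < board_size and 0 <= col < board_size):
        return None  # out of bounds for the board
    return row, col


print(parse_move("1,2"), parse_move("1\n2"), parse_move("oops"))
```

Returning None for anything unexpected gives the caller a chance to ask again, rather than letting a bad input reach the board logic.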

I submitted a Pull Request and sent it out to be judged. I still have to go back and make some changes to the pull request I made for part 1, but I may end up finishing that up later this week.