GSoC#9: It's already time?

This is it.

I triggered a discussion in MacPorts which led to a meeting involving the whole community. I'm so noob rn I get goosebumps at such a statement:

"If we start at 15 UTC or later, we could hopefully cover all time zones from the west USA to India, including even China in the evening."

With this, the GSoC period officially comes to an end, closing what has been (and will continue to be) a wonderful journey.

Now, I await the final verdict.


EDIT: Passed. :D

GSoC#8: Project and pre-final evaluation period

It's nearly over and I don't like it.


Effectively, my project had three actions to implement: snapshot, migrate and restore. Both migrate and restore depend largely on the snapshot action, and I divided each action further into procedures or simpler steps, as you probably know from my previous emails throughout the term. I have the following points to make, and I welcome all kinds of comments.

1. snapshot action is entirely finished for now.
  • All the Tcl and C functions for creating a snapshot, fetching a snapshot given its id, fetching a list of recent snapshots, fetching properties of a certain snapshot have been finished.
  • This required creating a new entity named reg_snapshot and writing its helper functions in util.c, matching the ones already there for entry, file, portgroup etc.
2. restore action is also finished for now.
  • It takes an optional argument, as in 'port restore --snapshot-id <id>', to restore a specific snapshot. Without the argument, it lists the snapshots on the console, asks the user to select an id and proceeds.
  • It first deactivates all the currently active ports and then reproduces the install commands for the selected snapshot. NOTE that it doesn't uninstall the original ports, as desired for the minimal build.
  • Before deactivating, it sorts the portlist so that dependencies come later.
  • Likewise, before installing the selected snapshot, it sorts the snapshot's ports so that dependencies come first.
  • I have used "registry::run_target" directly for all this and tested by modifying registry.db locally to cover as many scenarios as I could think of.
3. migrate action has essentially four steps: a) create a snapshot, b) uninstall the current ports, c) upgrade the port command and d) restore the last snapshot.
  • Out of these 4, creating a snapshot and restoring are simply imported from the respective modules.
  • I have finished the procedure for uninstalling the ports and verified.
  • The one task left to commit is to cover, if even possible, all the scenarios in which the port command needs upgrading, which basically means checking for a change of OS or arch in macports.conf.
  • While I have added the simplest checks for it, like here [0], there was a wonderful discussion thread on it by Mojca [1] and Clemens [2] yesterday covering many good cases, such as a change of compiler or a change in libraries by Apple. This is what I am working on, as mentioned in my last update.
4. I have added most of the documentation for all the modules, functions and any important steps, listed any minor TODOs in the code itself, added ui_msg statements, cleaned the code etc.

5. Apart from this, I have been trying to get more involved in the community on the dev mailing list, and since Ken pointed out that writing a portfile is not too difficult a task, I have been looking at the portfile tutorial as well.

6. Future Plans related to the project:
  • First, finish the remaining part of the migrate action.
  • Work on the test cases for all three modules.
  • On mportinit, we should suggest the user run `port migrate` instead of (or in addition to) the link to migration guide.
Attachments: Link 1 Link 2 Link 3
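
The "most simple checks" for an OS or arch change from point 3 could look roughly like the sketch below. It is only an illustration: `platform_changed`, `current_platform` and the shape of the saved record are made up for this post and are not the code linked at [0]. Only the `tcl_platform` array is a real Tcl built-in.

```tcl
# Hypothetical helper (not MacPorts code): compare the platform recorded at
# snapshot time with the platform we are running on now.
# Both arguments are dicts with the keys "os_major" and "arch".
# Returns the list of keys whose values differ (empty if nothing changed).
proc platform_changed {saved current} {
    set changed {}
    foreach key {os_major arch} {
        if {[dict get $saved $key] ne [dict get $current $key]} {
            lappend changed $key
        }
    }
    return $changed
}

# Describe the running system using Tcl's built-in tcl_platform array.
proc current_platform {} {
    global tcl_platform
    # osVersion looks like "16.7.0" on Darwin; keep only the major part.
    set os_major [lindex [split $tcl_platform(osVersion) .] 0]
    return [dict create os_major $os_major arch $tcl_platform(machine)]
}

set saved [dict create os_major 15 arch i386]
set now   [dict create os_major 16 arch x86_64]
puts [platform_changed $saved $now]   ;# os_major arch
puts [platform_changed [current_platform] [current_platform]]   ;# empty line
```

A non-empty result would be the signal that the port command needs upgrading before restoring. Cases like a compiler change would need more fields than this sketch has.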

GSoC#7: Global variables are bad.

I couldn't be happier to see this, after eight hours of struggle:


Why did I get stuck? Because I'm used to global variables in C from lower division classes at my university. I always tried to escape the pain of passing by reference or address or value and simply used a global pointer. But when it comes to large code bases spanning several modules, like the MacPorts one, that's not possible anymore. How often does someone pass a pointer by address?

It was a good lesson. The next challenge is to pass this struct to Tcl.

GSoC#6: Project and pre-second evaluation period

Unlike the previous one, this was a more productive period. There were some important discoveries I made about how the port command works and how the macports-base is written to serve the Tcl and registry APIs for non-core modules.

I'll finally update you on the three phases of my project in a more formal way.
  1. A snapshot is a list of all commands that created the current installed state.
  2. Restoring a snapshot deactivates the active ports and reproduces the install commands for the selected snapshot.
  3. Migrate creates a new snapshot, uninstalls installed ports and reproduces the install commands for the last snapshot.
I described the snapshot in great detail in the last post. Changes have been proposed to it since: I was confused about how to get the whole snapshot action to behave as a single transaction, and Rainer pointed out the need to have the snapshot logic begin at a higher level, probably using Tcl wrappers.

Second Phase: In all, the action `migrate` aims to create a new snapshot, uninstall all ports, upgrade the port command for the new architecture and finally restore the last snapshot, which involves installing all ports from the snapshot we created before uninstalling. So far, I have finished the following procedures.

  • uninstall_installed to uninstall all the currently installed ports.
  • recover_ports_state to install and activate the ports (according to the snapshot), bringing the state as close as possible to what it was before migrating.
  • sort_portlist_by_dependents and port_dependencies to get the dependents and dependencies in topologically sorted order, so that broken ports are not installed.
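
To show what "topologically sorted" means here, below is a minimal sketch of a depth-first sort over a dependency table. It is not the code of sort_portlist_by_dependents or port_dependencies: the procs `sort_by_dependencies` and `visit`, and the dependency data, are made up for illustration.

```tcl
# Hypothetical sketch (not the MacPorts implementation).
# deps is a dict: port name -> list of ports it depends on.
# Returns the ports ordered so that every dependency precedes its dependents.
proc sort_by_dependencies {deps} {
    set sorted {}
    set state [dict create]   ;# port -> "visiting" or "done"
    foreach port [dict keys $deps] {
        visit $port $deps state sorted
    }
    return $sorted
}

# Depth-first visit: emit all dependencies of a port before the port itself.
proc visit {port deps stateVar sortedVar} {
    upvar 1 $stateVar state $sortedVar sorted
    if {[dict exists $state $port]} {
        if {[dict get $state $port] eq "visiting"} {
            error "dependency cycle involving $port"
        }
        return
    }
    dict set state $port visiting
    if {[dict exists $deps $port]} {
        foreach dep [dict get $deps $port] {
            visit $dep $deps state sorted
        }
    }
    dict set state $port done
    lappend sorted $port
}

# Made-up dependency data for the example.
set deps [dict create \
    apache2  {apr apr-util openssl} \
    apr-util {apr} \
    apr      {} \
    openssl  {zlib} \
    zlib     {}]

set install_order [sort_by_dependencies $deps]
puts $install_order              ;# apr apr-util zlib openssl apache2
puts [lreverse $install_order]   ;# apache2 openssl zlib apr-util apr
```

The first order is safe for installing (dependencies first), and the reversed one is safe for uninstalling or deactivating (dependents first).
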
There is a great deal of the project left for the next phase. At a higher level, I still haven't written the code for fetching the snapshot from the database, and everything needs rigorous testing and error handling. From the previous phase, the changes proposed above are still pending.

The next phase will include finishing the migrate action and connecting it to the snapshot action but, mainly, the implementation of the restore action. I'll very likely need the help of my mentor in the coming phase.

In this period, I have also tried to improve on updates for the community with all the issues, leftovers and other points separately laid out. This helped to bring out the best in the discussion.

Find the work log here.


GSoC#5: A few words on Tcl (and some advice!)

After spending a couple of hours on reddit and reading other articles, I found some really interesting insights about Tcl which you might otherwise never learn about a programming language. Here are some of the facts about Tcl that I found interesting:

To quote Philip Greenspun, as a software developer, you're unlikely to get rich. So you might as well try to get through your life in such a way that you make a difference to the world (like every startup shouts "make the world a better place"). Tcl illustrates one way:
  • make something that is simple enough for almost everyone to understand,
  • give away your source code
  • explain how to "weave" your source code in with other systems
Looks like something we can do? Yesss. But do we? Noo.

Tcl rocks! Well, everything IS a string, like in a fundamental way. Just look at the tape from a Turing Machine which is just an infinite mutable string. This is really, really true. Like, when you do something like a for loop, you're basically running a "for" command and passing it four strings.
 for {set x 0} {$x<10} {incr x} {  
   puts "hey"  
 }  
is equivalent to:
 for "set x 0" {$x<10} "incr x" "puts \"hey\""  
:o :o

Numbers and strings being interchangeable actually works. Everything being a string is damn powerful, but it takes a while to get into the mindset that you are not just coding the solution up but programming the programming to the solution.
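
A few lines to play with in tclsh, just as a small sketch of the idea (the variable names and values are made up):

```tcl
set n "10"                    ;# a string that happens to look like a number
puts [expr {$n + 5}]          ;# 15
puts [string length $n]       ;# 2

set ports "zlib openssl apache2"   ;# a plain string is also a valid list
puts [lindex $ports 1]        ;# openssl
puts [llength $ports]         ;# 3

# Even a loop body is just a string that can be stored and passed around.
set body {puts "hey"}
for {set x 0} {$x < 2} {incr x} $body   ;# prints "hey" twice
```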

Oh, and the "uplevel". "uplevel" is a terrifying and baffling feature. It gives you access to the local variables of any function that called you. Say, whaaat?
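
Here is a small sketch of what that looks like in practice. The `repeat` and `demo` procs are made up for this post; the point is that the body passed to `repeat` runs in the caller's scope.

```tcl
# A tiny "repeat" control structure built with uplevel (hypothetical name).
# The body runs in the caller's scope, so it can see and change the
# caller's variables.
proc repeat {count body} {
    for {set i 0} {$i < $count} {incr i} {
        uplevel 1 $body
    }
}

proc demo {} {
    set total 0                ;# local to demo
    repeat 3 {incr total 10}   ;# the body modifies demo's local variable
    return $total
}

puts [demo]   ;# 30
```

This is how you can write your own control structures in plain Tcl, which is both the terrifying and the powerful part.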

Tcl : scripting languages :: C : compiled languages.

As jaymaj21 comments on reddit, "simple small footprint, least hairy implementation." One of the most efficiently parsed scripting languages, Tcl is simple and way easy to implement. You can just write up your own Tcl interpreter very quickly. You can write a Tcl program that pokes around the run-time environment, for example, using "info exists x".
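
A short sketch of that poking around, using only built-in `info` subcommands (the proc `greet` and the variable `snapshot_id` are made up for the example):

```tcl
proc greet {name} {
    return "hello, $name"
}

puts [info exists snapshot_id]   ;# 0, the variable has not been set yet
set snapshot_id 42
puts [info exists snapshot_id]   ;# 1

puts [info args greet]           ;# name
puts [info body greet]           ;# prints the source of greet's body
puts [llength [info commands greet]]   ;# 1, greet is a known command
```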

A language that is not powerful by itself can become powerful if a procedure can invoke the interpreter, that is, if a program can write a program and then ask to have it evaluated. In Tcl, you do this by calling eval!
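
A minimal sketch of a program writing a program (the `run_*` procs are made up, they are not MacPorts procedures). Building the command with `list` keeps the quoting safe:

```tcl
# Build a command as data, then ask the interpreter to run it.
set cmd [list set greeting "hello from eval"]
eval $cmd
puts $greeting                 ;# hello from eval

# Generate a whole procedure at run time (names here are made up).
foreach action {snapshot restore migrate} {
    eval [list proc run_$action {} [list return "running $action"]]
}
puts [run_migrate]             ;# running migrate
```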

Tcl is more of "Tcl, C Library" because you can access about everything from C in a way that's pleasant to use. 

God was I lucky to get a Tcl project for my GSoC. Finger lickin' good!

GSoC#4: Project and pre-first evaluation period

My project is about adding support to automate the migration process, in particular, after an OS upgrade or hardware change. So, from a set of manual migration instructions, the project aims to narrow it down to `port migrate`.

The project is divided into three phases: getting the information (in our database), removing the (existing) information, restoring the (updated) information. I know, it doesn’t seem to be a really efficient logic but we plan to improve it further as we go.



First Phase: The action `snapshot` aims to replicate the user install commands. The plan was to implement this using the database rather than files, which justifies the need to add support for the new tables through the entire stack, up from cregistry.
  • The snapshot action basically takes a complete copy of all the installed ports and the metadata. Till now,
    • 3 new sqlite tables have been added to the registry’s create and update functions, and the registry db version has been bumped so that existing installations get them too.
    • snapshot_create in registry2.0/entry.c enables access to the sqlite database from Tcl frontend using registry APIs. It calls cregistry/entry.c reg_snapshot_create internally.
    • reg_snapshot_create creates a snapshot in the snapshots table, calls snapshot_store_ports.
    • snapshot_store_ports stores info on all currently installed ports in the snapshot_ports table (and requested flag)
    • snapshot_store_port_variants stores port variants with the sign in the corresponding table. It depends on getting parsed variants which I have not written yet (reason below).
  • Time spent on figuring out things initially counts as well, right? Only if this does not become an endless excuse.
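
To make "variants with the sign" concrete, here is a rough sketch of splitting a variant string into name and sign pairs. It is not the registry code and not the unwritten get_variants_parsed: the proc name `parse_variants` and the assumption that variant names consist only of letters, digits and underscores are mine.

```tcl
# Hypothetical helper, not the registry code: split a variant string such as
# "+workermpm-preforkmpm" or "-preforkmpm +workermpm" into {name sign} pairs.
# Assumption: variant names consist of letters, digits and underscores.
proc parse_variants {variant_string} {
    set result {}
    foreach {match sign name} [regexp -all -inline {([-+])([A-Za-z0-9_]+)} $variant_string] {
        lappend result [list $name $sign]
    }
    return $result
}

puts [parse_variants "-preforkmpm +workermpm"]
# {preforkmpm -} {workermpm +}
puts [parse_variants "+openldap"]
# {openldap +}
```

Each pair would then become one row for the port in the variants table.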


Talk about Future: Go according to the Timeline mentioned in the proposal. Quickly finish the leftovers in this 4-day planned buffer and start with the migrate action. Things seem to be picking up fast compared to the initial weeks. I’m now more aware of where lies what, hopefully.


I was stuck at.. 
  • for a long time, Tcl or C? In what language do I have to write the above three functions, Tcl or C? Believe me, this is not something you ask your mentor or community or perhaps, anyone. Then I started looking at the implementation of existing commands, one by one, took 3 to 4 days, and then one fine morning, I saw the mapping array from command to function. It was there, right at the bottom of entry.c. Yes, if you give this argument to registry::entry, this method will be called. Had a sound sleep that afternoon.
  • for a short time, probably for a day, not knowing that there exists something like reg_entry_propget to help you see through the dark: to get what you want, just give the key. Should have spent more time going through the C files than the Tcl ones during the community bonding period :/.


I am stuck at ..
  • Only one variant of a port can be active at a time. Makes sense. Running `port install apache2 -preforkmpm +workermpm` and then `port install apache2 +openldap` saves apache2 as two different ports in the registry. So, reg_entry_propget only returns one variant of a port when passed a reg_entry pointer. Now, how do I link the two variants installed (note installed and not active) to the same port in my database? If I can't, all we are left with is one variant mapped to “each” port, questioning the need for the snapshot_port_variants table.
  • Till ^ gets resolved, I won’t be able to write get_variants_parsed function.
  • IMPORTANT: where to add the OS check? I tried starting a discussion on this but never arrived at a decent conclusion.


Still not clear ..
  • where to include the --list for listing snapshots? if with `snapshot` action, then it creates the need for --create which I do not prefer personally. But if with `restore`, then `port restore --list` doesn’t appeal as well. Only if a command existed which lists the choices first and then proceeds so that I could simply copy their design strategy and save myself from confusion every time I decide to do things nicely.
  • limiting the number of snapshots that a user can take?
  • and more to come :).
Screenshot if someone enjoys looking at it, I just like the shutter sound:




I like this one:


Also, I forget, the unreviewed commits are here.




EDIT:
A doubt, so fundamental to my project, that got cleared while writing this post:
  • We’re solving for these two end use cases (correct me if I'm wrong):
    • running simply `migrate` is (snapshot the current installed state + restore).
    • `snapshot` followed by `restore`: restoring back some existing snapshot.
  • Interactive restore could be a solution for --list?
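
A rough sketch of what such an interactive restore could do: take the id if one was given, otherwise list the snapshots first and ask. Everything here is hypothetical (the proc `choose_snapshot`, the `{id note}` shape of a snapshot and the sample data); it only illustrates the flow.

```tcl
# Hypothetical sketch of picking a snapshot for restore (not MacPorts code).
# snapshots is a list of {id note} pairs; wanted is the id from the command
# line or "" when none was given; in is the channel to read the choice from.
proc choose_snapshot {snapshots {wanted ""} {in stdin}} {
    if {$wanted eq ""} {
        # No id given: list the choices first, then ask.
        foreach snap $snapshots {
            lassign $snap id note
            puts [format "%3s  %s" $id $note]
        }
        puts -nonewline "Snapshot id to restore: "
        flush stdout
        set wanted [string trim [gets $in]]
    }
    foreach snap $snapshots {
        if {[lindex $snap 0] eq $wanted} {
            return $snap
        }
    }
    error "no snapshot with id '$wanted'"
}

set snapshots [list [list 1 "before OS upgrade"] [list 2 "after cleanup"]]
puts [lindex [choose_snapshot $snapshots 2] 1]   ;# after cleanup
```

With this shape, neither `--list` nor `--create` is needed: `restore` without an id simply becomes the listing.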

GSoC#3: A 'miss'-communication

This happened in the very first week of the coding period of my GSoC. I had things in mind to talk about but they just remained in my mind forever until I received an email that scared me out of my wits and condemned me for not communicating at all for the last ten days. And it was true. 

Failing your evaluations just because you are not communicating properly is like not accepting help from the rescuers but waiting for God to come to rescue you while you're drowning.

Communication is the key. How much did you think about the project over the last week? How much effort did you put into thinking about your proposal? What does your research say? They need to know and you need to tell. It becomes difficult when it's remote.

Think of it as an office internship in some company. Not communicating at all means you go to the office but do not speak at all or engage with anyone and just watch around. Isn't this when we start calling the person weird and alien? People have fought for the freedom of speech and now, you're just being shy. Stupid!

Know how to sell yourself. Research well before making your points but if it takes time, then throw a mail to let the community know that you ARE researching and have the following updates to share. They don't know if you were staring at a blank screen all the time or even existed for that while. Play on your strengths and work on your weak points.

The community is not interested in the project code or the small fixes you make but in your approach to solving larger problems. The bigger things are more interesting to you and to the organization. What is it that you bring to the table? What would you do to improve the project beside the small fixes? Think about how you would spend your time this summer. 

They are interested both in your technical skill and in your holistic understanding of the project. Ultimately, they will judge you by the thoughtfulness of your proposal.

GSoC is just another way Google provides for you to connect with people around the world. Review each other's code, your mentor's code each day. Think of it as a score for your personal development. You are not given this opportunity to complete one project and just get done. You failed if that's what you plan to do, even if you pass the official evaluations. You are given a lifelong opportunity to be a part of a community having members from all around the world and all you did was write five hundred lines of code. Ha!

To quote my mentor, "Communication is the foundation of a team. It is as important as any code you ever write. It is unlikely you will focus on building things only for yourself and success in building with and for others will largely depend on your ability to communicate. Always work at refining your communication skills. They will pay off in many dimensions of your life."

There are absolutely no limits to what someone can do! It's just how badly you want it.

Good luck!

GSoC#2: It begins... Community Bonding!

Now that Google Summer of Code 2017 has officially begun, I am not sure what to do.

Well, I had a lot of plans for this period in my proposal, but they just seem trivial as well as confusing now, reading code and all that stuff. So most of the community bonding period seemed to pass by and I lost hope. Who would even dare to go through a minimum of three to four files having more than 5000 lines of code?

Finally, a video on Introduction to MacPorts Base, recorded by Clemens at the last MacPorts meeting, came to the rescue and gave the overview anyone needs to find and fix things in base. Basically, the video is a tour through MacPorts base, hopping between the port client, macports1.0, Tcl slave interpreters and the C code for database access and for dealing with system settings like proxy.

My notes (in raw form) while watching the video:

  1. port.tcl - port commands and corresponding procedures.
  2. macports.tcl -
    • mportinit - 
    • mportlookup - sources (default, ... ), PortIndex
    • mportopen - run the PortFile, does some caching as well, also, work done in a slave interpreter ($workername), aliases of macports APIs for use by portfiles
    • mportexec - takes mport returned by mportopen, gets the slave interpreter instance from mport
    • Example - portfetch, elevateToRoot and dropPrivileges, curl as a Tcl extension
    • functionality for command line users to keep it simple and easy to write port clients vs. complicating it for UI
  3. port1.0 - *portfile context*, eval_target, target_run
  4. package1.0 - downloading pre-compiled packages, verifying and extracting files
  5. darwintracelib1.0 - implementation of the trace mode, loaded into processes while they build
  6. machista1.0 - Mach-O parser used by rev-upgrade, SWIG
  7. programs - daemondo for launchd scripts
  8. CFLib1.0 - core foundation, bridge between Tcl and others, cocoa APIs, getting proxy settings (redundant)
  9. tclobjc1.0 - bridge between Tcl and Objective C (redundant)
  10. tests
  11. registry2.0, cregistry (C API for the registry database), port activate and deactivate, multiple versions and revisions of ports
There's no point in expanding the above into paragraphs. They are more like keywords. To the left of hyphen are the modules present in macports-base src directory. I know this isn't any interaction with the community but helps me at least know what to ask. Also, by now, I have got commit access to MacPorts's Git repository and a MacPorts @macports.org handle :D!

I'll be talking about my project in the next post, soon.

GSoC#1: Sounds like Summer!

After waiting anxiously for the results and refreshing the page hundreds of times to see, if, by any chance, the results were out before 2130 UTC, officially I'm in!!!

It seems like yesterday when I was an open-source newbie on IRC, with a little knowledge of programming in C and a bit of Python learned through university courses. Little did I know that writing programs is just the prologue of the process called software engineering. Looks like the summer is going to be a lot of cmakes and compilation errors.

I will be working as a student developer under Google Summer of Code (GSoC) for The MacPorts Project. I would like to thank pixilla, cal, and ijackson for helping me get started, being patient throughout my doubts, and giving feedback on my proposal.

                                               GSoC'17MacPorts


I'll try to post updates over here every 2 weeks and try to document as much as it's feasible for me. So, watch out for more about the project.

My GSoC: MacPorts proposal can be found here.