[Testing Needed] Reducing Robot Boot Time #3745
Draft
suchirss wants to merge 3 commits into
… by 0.5 seconds on restart
…ng to network. Previously started running after already connected to network.
Description
An issue we have during competition is that robots take too long to reboot during matches. This results in idle time while we wait for boot.
After investigating why this might be the case, I found two likely bottlenecks:
Context:
- … systemd as their service manager. systemd configures certain services on boot.
- … .service files to configure custom services that will be managed by systemd
- … thunderloop_main on the Raspberry Pi.

Problem:
- thunderloop.service specifies that thunderloop_main should wait for the network to be online; specifically, it is blocked by systemd-networkd-wait-online.service
- … waitForNetworkUp service.

Solution:
- … thunderloop.service so that it does not require the network to be online. On Pi boot alone, this cuts ~10 seconds. Ideally, our robots should be functional ~10 seconds faster on reboot with the changes made here. This is what needs to be tested.
- … systemd-networkd-wait-online.service altogether. This can be baked into the setup_pi.yml ansible file.

thunderloop.cpp
Note: this PR doesn't fix this problem. If boot time is acceptable with the solution to Problem 1 above, this second problem does not need to be addressed prior to RoboCup.
thunderloop.cpp has setup functions that block one another, which adds to boot time before our robots can step into the main loop. One of these blocking setup functions is waitForNetworkUp, which blocks the thread until the network is up. For example, the power and motor services are blocked by waitForNetworkUp even though they don't require the network. My suggested fix: use the four cores on the Raspberry Pi to run setup tasks in parallel, use multithreading so that setup services like waitForNetworkUp aren't blocking, or use a combination of both.

Testing Needed
As mentioned above, this PR only addresses the blocking behaviour of thunderloop.service, such that thunderloop_main does not wait on systemd-networkd-wait-online.service. On the Raspberry Pi alone, this resulted in a ~10 second time save on startup. Your (the tester's) job will be to check if this time save translates from "time saved on Pi boot" -> "time saved from reboot to when robot is functional".

When I took on this task, we did not have any functional robots. As such, I don't actually know how long it takes from reboot → functional robot.
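One way to put a number on the reboot → functional time is to read the kernel's uptime at the moment the robot becomes functional and log it. The following is a minimal C++ sketch of that idea, assuming the Pi runs Linux and exposes /proc/uptime. The names parseUptimeSeconds, secondsSinceBoot and logTimeToFunctional are hypothetical and are not part of thunderloop. Note that uptime counts from kernel start, so it excludes shutdown and bootloader time.

```cpp
#include <cassert>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Parses /proc/uptime-style text: "<seconds since boot> <idle seconds>".
// Writes the first field to `seconds` and returns true on success.
bool parseUptimeSeconds(const std::string& contents, double& seconds)
{
    std::istringstream stream(contents);
    double value = 0.0;
    if (stream >> value && value >= 0.0)
    {
        seconds = value;
        return true;
    }
    return false;
}

// Reads the uptime from `path` (assumed to be /proc/uptime on the Pi).
// Returns false if the file cannot be opened or parsed.
bool secondsSinceBoot(double& seconds, const std::string& path = "/proc/uptime")
{
    std::ifstream file(path);
    if (!file.is_open())
    {
        return false;
    }
    std::string line;
    std::getline(file, line);
    return parseUptimeSeconds(line, seconds);
}

// Hypothetical hook: call once, right after the last setup step finishes.
void logTimeToFunctional(std::ostream& out)
{
    double uptime = 0.0;
    if (secondsSinceBoot(uptime))
    {
        out << "functional " << uptime << " s after kernel start\n";
    }
    else
    {
        out << "could not read uptime\n";
    }
}
```

Calling logTimeToFunctional(std::cerr) at the point you consider the robot functional, once with and once without the thunderloop.service change, would give two numbers to compare directly.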
… thunderloop.cpp setup functions are complete and the robot can accept messages and use them to move around. Your definition for what qualifies as a functional robot may depend on what level of working robot we have at this stage.

I would suggest testing in the following incremental steps:
- … systemd-analyze tool to see how long boot takes. These commands in particular are very useful:

Resolved Issues
Length Justification and Key Files to Review
Review Checklist
It is the reviewer's responsibility to also make sure every item here has been covered
- … .h file) should have a javadoc style comment at the start of them. For examples, see the functions defined in thunderbots/software/geom. Similarly, all classes should have an associated Javadoc comment explaining the purpose of the class.
- … TODO (or similar) statements should either be completed or associated with a github issue