
Rebuild networking to use TCP vs UDP #6

Open
dmssargent wants to merge 41 commits into HagertyRobotics:master from dmssargent:feature-tcp


@dmssargent

Contributor

The UDP system requires binding ports to fixed numbers, and if/when the app crashes a port may get leaked. The TCP system requires only one socket to run (two if multicast is enabled: the second is a UDP socket that listens for the multicast). The UDP code also uses hardcoded strings for IP addresses instead of relying on validated user input.

What I have done:

  • Rebuilt the networking components to use TCP instead of UDP
  • Refactored out the old systems
  • Wrote a new data exchange protocol
  • Added simulation options to the settings menu

What doesn't work:

  • Thread safety is apparently not so good (the Android app is throwing a ConcurrentModificationException, which shuts down the server). EDIT: Not quite. According to the Android app thread dumps, I get caught in a dead spin now that I have reimplemented the storage queue with a proper object designed for this usage (the old one invalidated the iterator in my foreach when the data got changed from a different thread).
  • Console is overflowing: the amount of data being sent and received makes the console output noticeably slow down the application, and it takes its sweet time to catch up after running for more than a few seconds (right now, one minute of running takes the console writer ten minutes to print). The code that displays the full data is now commented out, so the speed is adequate.
  • The device list getter/setter doesn't work just yet (probably because of the first problem). Is the XML deserializer working?
  • Commenting (I need to add quite a few comments)
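For illustration, the iterator invalidation described in the first bullet, and one way around it, can be sketched as follows. This is a minimal sketch and not the code in this pull request: `PacketStore` and its methods are hypothetical names, and it assumes the storage queue is filled by some threads while another drains it. `ConcurrentLinkedQueue` is one standard-library queue that tolerates that pattern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical illustration; none of these names come from the PR.
class PacketStore {
    // Safe to add to from other threads while one thread drains it.
    private final Queue<String> packets = new ConcurrentLinkedQueue<>();

    void add(String packet) {
        packets.add(packet);
    }

    // Removes and returns everything currently queued, without an iterator.
    List<String> drain() {
        List<String> out = new ArrayList<>();
        String next;
        while ((next = packets.poll()) != null) {
            out.add(next);
        }
        return out;
    }

    // Reproduces the old failure mode on a single thread: changing a plain
    // ArrayList inside a foreach invalidates the iterator the loop is using.
    static boolean plainListFails() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String s : list) {
                if (s.equals("a")) {
                    list.add("d");
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

The same exception appears when the modification comes from a different thread, only less predictably, which is why it tends to show up under load.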

What we need to do:

  • Figure out what is happening when a ConcurrentModificationException gets fired

  • See what happens once the above is fixed, and whether it works

  • Log the packet data and only display the important details if necessary

  • Add a way of turning debug mode on/off*

  • Get and validate user input for an IP address if multicasting is off or fails*

  • Add an option to turn multicasting on or off*

  • Add more ways to stop / restart the Simulator*

  • Fix any leaking memory (when the client is left running it caches everything until it gets processed). I have trimmed the memory usage down from 2.5GB to 100MB (client) and from 100MB to 16MB (my side). That's what I get for forgetting to remove a section of the mirror protocol, plus having a high data retention policy.

  • Figure out why the networking is acting buggy

    *Since you are maintaining the interface, I would recommend that you do these items
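As a rough idea of what validating a user-entered IP address could look like, here is a minimal sketch. The class and method names are hypothetical, it only handles dotted-quad IPv4 text, and it is an assumption about the desired rules rather than anything already in this branch.

```java
// Hypothetical helper; the name and the accepted format are assumptions.
final class IpInput {
    private IpInput() {}

    // Returns true only for dotted-quad IPv4 text such as "192.168.1.10".
    // Parses by hand instead of calling InetAddress.getByName, which may
    // perform a DNS lookup when the text is not a literal address.
    static boolean isValidIpv4(String text) {
        if (text == null) {
            return false;
        }
        // Limit -1 keeps trailing empty parts, so "1.2.3.4." is rejected.
        String[] parts = text.trim().split("\\.", -1);
        if (parts.length != 4) {
            return false;
        }
        for (String part : parts) {
            if (part.isEmpty() || part.length() > 3) {
                return false;
            }
            for (int i = 0; i < part.length(); i++) {
                char c = part.charAt(i);
                if (c < '0' || c > '9') {
                    return false;
                }
            }
            if (Integer.parseInt(part) > 255) {
                return false;
            }
        }
        return true;
    }
}
```

A settings screen could call `IpInput.isValidIpv4(userText)` before saving the address, and fall back to asking again when it returns false.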

I opened this early so I could get feedback from you and request changes; it is not ready to be considered alpha yet.

@dmssargent

Contributor Author

I resolved the issue with the ConcurrentModificationException, which let the dead spin show up.
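A dead spin of this kind often comes from a consumer that loops on an empty queue without ever blocking. The sketch below shows the general shape of the fix under that assumption. `QueueConsumer` is a hypothetical name, and this is not necessarily how the queue in this branch is structured.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a consumer that waits instead of spinning.
class QueueConsumer {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    void offer(String item) {
        queue.add(item);
    }

    // The shape to avoid, shown only as a comment:
    //   while (queue.isEmpty()) { /* burns a core and never sleeps */ }

    // Parks the calling thread until an item arrives or the timeout elapses.
    // Returns null on timeout so the caller can check a shutdown flag.
    String next(long timeoutMillis) throws InterruptedException {
        return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

Using a timeout rather than an unbounded wait gives the processing loop a regular chance to notice that the server is shutting down.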

@dmssargent

Contributor Author

Dead spin fixed, plus a few minor touch-ups. I can now see that your code is throwing: org.xml.sax.SAXParseException: Unexpected token (position:TEXT ?@1:2 in java.io.InputStreamReader@376a700e). I will check whether the data reaching the parser is valid, or whether the code handling the XML is expecting a different XML format.
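One way to tell those two cases apart is to check the received bytes before they reach the real parser. The following is a minimal sketch using only the standard `javax.xml.parsers` API. `XmlCheck` and its methods are hypothetical names, and it assumes the payload is available as a byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical diagnostic helper; not part of the PR.
final class XmlCheck {
    private XmlCheck() {}

    // Returns true if the payload parses as a well-formed XML document.
    static boolean isWellFormed(byte[] payload) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(payload));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }

    // Hex dump of the first bytes, useful for spotting stray bytes that
    // arrive ahead of the opening '<'.
    static String hexPrefix(byte[] payload, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Math.min(count, payload.length); i++) {
            sb.append(String.format("%02x ", payload[i] & 0xff));
        }
        return sb.toString().trim();
    }
}
```

If `isWellFormed` returns false on the receiving side but true on the sending side for the same message, the exchange is altering the bytes; if it returns true on both, the consuming code is more likely expecting a different document layout.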

@HagertyRobotics

Owner

Thanks. I'll start working on merging it into a branch to try it out. I just committed a lot of changes trying to get the GUI working better along with some restructuring.

@dmssargent

Contributor Author

Sorry about any delays. Do you have any ideas about the problems, or any suggestions?

@dmssargent

Contributor Author

I am working out why the XML is getting corrupted by the exchange. How would the Device List exchange work best for you?

@HagertyRobotics

Owner

Hi, I'm a little confused. While you were working on your new networking code, I was also restructuring some of the classes, and I changed the package names to ftccommunity. I then merged my new code into master. When you issued your 2nd pull request I took a look, but because I had made so many changes I would have to do some work to merge it with my latest develop branch. I haven't had a chance to merge it because I was working on getting the new ftc_app release working with the existing code. I guess the XML parser stuff above is just added to your original networking code. My question is: when you are working on your networking code, is it also merged with my latest master branch?

@dmssargent

Contributor Author

Some of the time. I am trying to stay somewhat recent, but I had to take a break (from coding) slightly before you started pushing, so I am trying to catch up. My side of Git went slightly nuts on this repo, so that's why there are multiple commits with the same comments.

@dmssargent

Contributor Author

I am still figuring out why the client is not replying to the server, but I improved/polished lots of things.

@dmssargent

Contributor Author

The bug is apparently in the server's receive path for non-heartbeat messages. I validated the following components:

Server

  • Server - send
  • Server -> Network Manager - send
  • [-] Server -> Network Manager - receive
  • Server - receive
  • Server - Heartbeat receive (strange things are happening here)
  • Encoder
  • Decoder

Client

  • Client - send
  • Client - receive
  • Client -> Network Manager - send
  • Client -> Network Manager - receive
  • Client -> Heartbeat receive
  • Client -> Heartbeat send
  • Decoder
  • Encoder

@HagertyRobotics

Owner

Thanks. I've been working on making sure I understand the communication between an actual Legacy Module and the phone. I'm still trying to understand switching between read and write modes.

@dmssargent

Contributor Author

I am going to try to re-read the Netty and Protobuf documentation, and see if our software mentor can spot the bug. If all else fails, I am going to need fresh eyes to spot that bug.

@dmssargent

Contributor Author

I seem to have fixed the bug. Now we can see that it gets stuck on init, waiting on the Simulator Device feedback. So I probably missed something when I updated the simulator to the latest version. I designed the networking to be backwards-compatible with your code (as in, the functions that handle specific things should have the exact same behavior as before). I did add the throwing of an InterruptedException to the functions that wait for user input, to allow safe termination. Can you give an overview of how each component is networked, or check your module code to see whether I only partially changed something?
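To illustrate what an interruptible wait looks like in general, here is a minimal sketch. `DeviceFeedbackGate` is a hypothetical name, and using a `CountDownLatch` is an assumption made for the example, not a description of the functions changed in this branch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a wait that can be cancelled by interrupting
// the waiting thread.
class DeviceFeedbackGate {
    private final CountDownLatch ready = new CountDownLatch(1);

    // Called by whichever thread receives the feedback.
    void signalReady() {
        ready.countDown();
    }

    // Blocks until feedback arrives or the timeout elapses; returns false on
    // timeout. InterruptedException is deliberately not caught here, so the
    // caller can unwind and shut down cleanly when its thread is interrupted.
    boolean awaitFeedback(long timeoutMillis) throws InterruptedException {
        return ready.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

Adding a timeout as well makes a stuck init visible as a `false` return instead of a thread that waits forever.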

@HagertyRobotics

Owner

Sure, I will look at it right now. After a lot of tracing with an actual device, I now understand more about what is going on in the modules. Thanks.

@dmssargent

Contributor Author

Let me know if you find the mistake. Do you want a layout of how everything works in the networking?

@HagertyRobotics

Owner

Yes, if it is not too much trouble. Thanks.

@dmssargent

Contributor Author

The Network Manager class keeps track of every send and of the received data. Received data is dumped into the Network Manager, then sorted by a "process queue" thread that moves everything to the right place. The Server/Client classes are Runnable classes that take care of bootstrapping the communications; they also handle the configuration options. The *Handler classes create sends and receive things. The Encoder/Decoder classes convert the Data class to a ByteBuf (Netty's buffer type, its counterpart to ByteBuffer), which gets written to the data stream, and vice versa.

Sending:
NetworkManager.requestSend (builds the Data object) - Handler -> NetworkManager.getNextSend() -> Encoder -> Data gets flushed to the stream

Receiving:
New data appears in the stream -> Decoder -> Handler -> NetworkManager.add - processQueue - getNextMessage (callable) -> getNextData (gets the encoded data; ASCII is used for the encoding)

  • -> The next object gets called immediately
  • - This may or may not happen immediately

There should be a protocol.md that goes into the communication.
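The flow above can be pictured with a stripped-down sketch. This is not the real NetworkManager: the class name, the `Type` values, and `processOne` are all hypothetical, and Netty's handlers, encoder, and decoder are left out so the example runs with the standard library only.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of the send and receive queues described above.
class NetworkManagerSketch {
    // Assumed message kinds; the real protocol may define others.
    enum Type { HEARTBEAT, DEVICE_LIST, SIM_DATA }

    static final class Data {
        final Type type;
        final String payload;

        Data(Type type, String payload) {
            this.type = type;
            this.payload = payload;
        }
    }

    private final Queue<Data> sendQueue = new ConcurrentLinkedQueue<>();
    private final BlockingQueue<Data> received = new LinkedBlockingQueue<>();
    private final Map<Type, Queue<String>> sorted = new ConcurrentHashMap<>();

    // Sending: build the Data object and queue it for the handler.
    void requestSend(Type type, String payload) {
        sendQueue.add(new Data(type, payload));
    }

    // Called by the handler; returns null when nothing is pending.
    Data getNextSend() {
        return sendQueue.poll();
    }

    // Receiving: the handler passes in decoded data.
    void add(Data data) {
        received.add(data);
    }

    // One step of the "process queue" thread: wait for an item, then file
    // its payload under its type.
    void processOne() throws InterruptedException {
        Data data = received.take();
        sorted.computeIfAbsent(data.type, t -> new ConcurrentLinkedQueue<>())
              .add(data.payload);
    }

    // Returns the next payload of the given type, or null if none is waiting.
    String getNextData(Type type) {
        Queue<String> queue = sorted.get(type);
        return queue == null ? null : queue.poll();
    }
}
```

In this shape the handler threads only touch `getNextSend` and `add`, while device code only touches `requestSend` and `getNextData`, so the two sides never share an iterator.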

@dmssargent

Contributor Author

Have you gotten anywhere? I have been busy trying to get 4 teams started on learning the new platform.

…ve an issue where the sim data and brick info types were being used on the same data type
@dmssargent

Contributor Author

I am thinking that I already need to do a major rewrite of my own code to increase readability and performance.

@HagertyRobotics

Owner

Same here. Working on getting started. I have more time now. I'll see what I can do. I like what you did but I need to make some changes to merge it with the current code.

@dmssargent

Contributor Author

Okay. What changes need to be made to merge it with the current code? Even once you get settled with this, please keep this pull request open so we can collaborate on the optimization and on rewriting the things that need to be rewritten.

@dmssargent

Contributor Author

If you are still looking at this feed, can you rewrite and point me towards the MR USB communication protocols on the PC side? I can re-implement the hooks that you had previously to emulate devices, and rebuild the TCP stack to work correctly.

@HagertyRobotics

Owner

Hi! I just saw your note. Yes, I would like to get back into the simulator. I need a few more weeks to see where we end up this season. Thanks for your interest.

@dmssargent

Contributor Author

Sorry for responding 23 days after your reply. I'm busy with FTC and FRC as well, so my time is limited right now. But I should be free to help out sometime soon.
