How to read raw binary without definition, and re-write to binary? #736
You could reverse engineer the definition. It's actually not that hard, and once it's done you are no longer limited. Alternatively, there is the low-level API for working with the wire format (example), which could also help you identify the format.
I find your example very interesting and I will continue down that road! So far I am analyzing this first part of a buffer:
0a df 11 32 9e 05 08 02 12 1c 0a 09 62 72 6f 77 73 65 5f 69 64 12 0f 46 45 77 68 61 74 5f 74 6f 5f 77 61 74 63 68 ...
But I struggled after some parts... hope you'll bear with me. From what I understood:
// 0a = 10dec = 0000 1010 = msb: 0, id: 1, wiretype: 2
My conclusion is: wiretype 2 (ldelim) with a length of 2271 bytes. So that's why I do:
Here is the console.log of the result from reader.string(): "������ which does not look correct. Parsing the same buffer with protoc.exe --decode_raw < buffer returns: 1 { So I expect I am missing something in the interpretation. Is the string by chance nested, so that I have to apply the same process to the return value of string()? How can I determine whether it's proto v2 or v3? Very glad for any feedback from you! Cheers, Markus
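The header bytes quoted in this comment can be checked by hand. The following is a minimal sketch in plain JavaScript, not protobuf.js code; `readVarint` and `readTag` are hypothetical helper names. It assumes that the payload of field 1 is itself a message (which the `1 {` in the protoc output suggests), so that the bytes after the first length prefix are read as another field key.

```javascript
// Prefix of the buffer quoted in the comment (the real buffer continues after it).
const prefix = Uint8Array.from([0x0a, 0xdf, 0x11, 0x32, 0x9e, 0x05, 0x08, 0x02]);

// Read one base-128 varint starting at `pos`.
// Returns the decoded value and the position of the next unread byte.
// Only safe for values below 2^53, which is enough for keys and lengths.
function readVarint(bytes, pos) {
  let value = 0;
  let factor = 1;
  for (let i = 0; i < 10; i++) {
    if (pos >= bytes.length) throw new RangeError("truncated varint");
    const b = bytes[pos++];
    value += (b & 0x7f) * factor;                 // low 7 bits, least significant group first
    if ((b & 0x80) === 0) return { value, pos };  // high bit clear: this was the last byte
    factor *= 128;
  }
  throw new RangeError("varint longer than 10 bytes");
}

// A field key is a varint holding (fieldNumber << 3) | wireType.
function readTag(bytes, pos) {
  const key = readVarint(bytes, pos);
  return { field: Math.floor(key.value / 8), wireType: key.value % 8, pos: key.pos };
}

const outerTag = readTag(prefix, 0);                // 0a    -> field 1, wire type 2
const outerLen = readVarint(prefix, outerTag.pos);  // df 11 -> 95 + 17 * 128 = 2271
const innerTag = readTag(prefix, outerLen.pos);     // 32    -> field 6, wire type 2
const innerLen = readVarint(prefix, innerTag.pos);  // 9e 05 -> 30 + 5 * 128 = 670
const nextTag = readTag(prefix, innerLen.pos);      // 08    -> field 1, wire type 0
const nextVal = readVarint(prefix, nextTag.pos);    // 02    -> 2

console.log(outerTag.field, outerTag.wireType, outerLen.value); // 1 2 2271
console.log(innerTag.field, innerTag.wireType, innerLen.value); // 6 2 670
console.log(nextTag.field, nextTag.wireType, nextVal.value);    // 1 0 2
```

Under that reading, the 2271 bytes are not text at all: they start with another length-delimited field (field 6, 670 bytes), which would explain why printing them as a string gives replacement characters.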
Looks like it's not just bytes, but submessages, so ...
Yep, but it's rather a buffer than a string.
looks like a sub-message (also corresponds to what protoc outputs: note the regarding protoc output, this continues. message structure is about:
etc. As you see, protoc's output is a good indicator of the field ids to expect. It also indirectly shows possible data types (strings, submessages with braces, but numbers could be varints or fixed32/64 bits).
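To illustrate how field ids and rough types can be recovered without a definition, here is a small sketch of a raw walker in plain JavaScript. It is not how protoc or protobuf.js are implemented; `tryDecode` and `interpret` are hypothetical names, and the rule "printable ASCII means string, otherwise try a sub-message, otherwise keep the bytes" is only a heuristic. The input is the part of the quoted buffer that happens to be complete (starting at `08 02`).

```javascript
// The complete fragment from the quoted buffer: 08 02, then field 2 with 28 payload bytes.
const fragment = Uint8Array.from(
  "08 02 12 1c 0a 09 62 72 6f 77 73 65 5f 69 64 12 0f 46 45 77 68 61 74 5f 74 6f 5f 77 61 74 63 68"
    .split(" ")
    .map((h) => parseInt(h, 16))
);

// Read one base-128 varint; throws RangeError if the input ends too early.
function readVarint(bytes, pos) {
  let value = 0;
  let factor = 1;
  for (let i = 0; i < 10; i++) {
    if (pos >= bytes.length) throw new RangeError("truncated varint");
    const b = bytes[pos++];
    value += (b & 0x7f) * factor;
    if ((b & 0x80) === 0) return { value, pos };
    factor *= 128;
  }
  throw new RangeError("varint longer than 10 bytes");
}

// Guess what a length-delimited payload is: text, a nested message, or opaque bytes.
function interpret(payload) {
  const printable = payload.length > 0 && payload.every((b) => b >= 0x20 && b < 0x7f);
  if (printable) return new TextDecoder().decode(payload);
  const nested = tryDecode(payload);
  return nested !== null && nested.length > 0 ? nested : payload;
}

// Try to parse `bytes` as a message. Returns a list of fields, or null if it does not parse cleanly.
function tryDecode(bytes) {
  const fields = [];
  let pos = 0;
  try {
    while (pos < bytes.length) {
      const key = readVarint(bytes, pos);
      const field = Math.floor(key.value / 8);
      const wireType = key.value % 8;
      pos = key.pos;
      if (field === 0) return null;                 // field number 0 is not valid
      if (wireType === 0) {                         // varint
        const v = readVarint(bytes, pos);
        fields.push({ field, wireType, value: v.value });
        pos = v.pos;
      } else if (wireType === 2) {                  // length-delimited
        const len = readVarint(bytes, pos);
        const end = len.pos + len.value;
        if (end > bytes.length) return null;
        fields.push({ field, wireType, value: interpret(bytes.subarray(len.pos, end)) });
        pos = end;
      } else if (wireType === 1 || wireType === 5) { // fixed64 / fixed32, kept as raw bytes
        const size = wireType === 1 ? 8 : 4;
        if (pos + size > bytes.length) return null;
        fields.push({ field, wireType, value: bytes.subarray(pos, pos + size) });
        pos += size;
      } else {
        return null;                                // groups (3, 4) and unknown types: not handled
      }
    }
  } catch (e) {
    if (e instanceof RangeError) return null;       // ran off the end: not a clean message
    throw e;
  }
  return fields;
}

const decoded = tryDecode(fragment);
console.log(JSON.stringify(decoded, null, 2));
```

For this fragment the walker reports field 1 as the varint 2 and field 2 as a sub-message containing the strings "browse_id" (field 1) and "FEwhat_to_watch" (field 2). Short strings can also happen to parse as valid messages, so any guess of this kind should be checked against the protoc output.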
You cannot. The proto3 wire format does not differ from proto2; it's just the field declarations that are all implicitly optional.
That makes sense. I assume this buffer uses V3, because nested messages in V2 would have wiretype 3 or 4, no?
No, wiretypes 3 and 4 are for legacy groups, a feature long deprecated in proto2 already. On the wire, proto2 and proto3 do not differ much. It's mostly language-level changes, like all fields being optional and new data types, but those new types use a backward-compatible encoding.
For a project I need to read a binary without having its proto definition. Using protoc.exe from Google prints something readable, but furthermore I need to change specific content and then write the content back to binary.
Any general advice? Would I need to dive deep in the protocol to understand how to decode manually?
Or would you suggest using the protoc.exe output, transforming it to, let's say, JSON, and rewriting it (with a somehow reverse-engineered proto)?
I am not necessarily tied to protobuf.js or any particular technology.
Any general advice is super welcome!
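As a rough illustration of the "change content and write it back" part, the sketch below re-encodes a small message tree by hand in plain JavaScript. It is an assumption-laden example rather than the protobuf.js API: `writeVarint` and `encodeMessage` are hypothetical helpers, the tree is the fragment from the buffer discussed in the comments, and the replacement value "FEmusic" is made up. It only covers varint and length-delimited fields.

```javascript
// Append a non-negative integer (below 2^53) to `out` as a base-128 varint.
function writeVarint(value, out) {
  while (value >= 0x80) {
    out.push((value % 128) | 0x80);   // 7 data bits plus the continuation bit
    value = Math.floor(value / 128);
  }
  out.push(value);
}

// Encode a list of { field, value } entries into an array of byte values.
// A number is written as a varint (wire type 0); a string or a nested list
// is written as a length-delimited field (wire type 2).
function encodeMessage(fields) {
  const out = [];
  for (const { field, value } of fields) {
    if (typeof value === "number") {
      writeVarint(field * 8 + 0, out);
      writeVarint(value, out);
    } else {
      const payload =
        typeof value === "string"
          ? Array.from(new TextEncoder().encode(value))
          : encodeMessage(value);     // nested message: encode the children first
      writeVarint(field * 8 + 2, out);
      writeVarint(payload.length, out); // length prefix is recomputed on every encode
      for (const b of payload) out.push(b);
    }
  }
  return out;
}

const toHex = (bytes) => bytes.map((b) => b.toString(16).padStart(2, "0")).join(" ");

// The fragment from the comments, written out as a tree.
const tree = [
  { field: 1, value: 2 },
  {
    field: 2,
    value: [
      { field: 1, value: "browse_id" },
      { field: 2, value: "FEwhat_to_watch" },
    ],
  },
];

const originalHex = toHex(encodeMessage(tree));
tree[1].value[1].value = "FEmusic";   // hypothetical new value, for illustration only
const editedHex = toHex(encodeMessage(tree));

console.log(originalHex); // 08 02 12 1c 0a 09 62 72 6f 77 73 65 5f 69 64 12 0f 46 45 77 68 61 74 5f 74 6f 5f 77 61 74 63 68
console.log(editedHex);   // 08 02 12 14 0a 09 62 72 6f 77 73 65 5f 69 64 12 07 46 45 6d 75 73 69 63
```

The point of encoding bottom-up is that every enclosing length prefix changes along with the edited string (here `1c` becomes `14`). In the full buffer the outer lengths would have to be recomputed the same way, and fields of other wire types would need to be carried through as raw bytes.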