Other interface methods on ElfBytes want to use the phdrs, and there's
no need to discover and validate the table each time it's needed.
This now performs that discovery and validation once with initial FileHeader
parsing, then simply clones the lazy ParsingTable on request, which is cheap -
it's just copying a slice, not copying actual data bytes.
Most of the other interface methods on ElfBytes want to use the shdrs, and there's
no need to discover and validate the table each time it's needed.
This now performs that discovery and validation once with initial FileHeader
parsing, then simply clones the lazy ParsingTable on request, which is cheap -
it's just copying a slice, not copying actual data bytes.
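To illustrate why handing out a clone is cheap, here is a minimal sketch. LazyTable and ParsedFile are hypothetical stand-ins, not the crate's actual ParsingTable or ElfBytes types; the only assumption carried over from above is that the table holds a borrowed slice plus an entry size.

```rust
/// Hypothetical stand-in for a lazily parsed table. It only borrows the
/// underlying bytes, so copying it copies a pointer, a length, and a usize.
#[derive(Debug, Clone, Copy)]
pub struct LazyTable<'data> {
    data: &'data [u8],
    entsize: usize,
}

impl<'data> LazyTable<'data> {
    /// Number of entries in the table.
    pub fn len(&self) -> usize {
        self.data.len() / self.entsize
    }

    /// Entries are only located on request; nothing is decoded up front.
    pub fn get(&self, index: usize) -> Option<&'data [u8]> {
        let start = index.checked_mul(self.entsize)?;
        let end = start.checked_add(self.entsize)?;
        self.data.get(start..end)
    }
}

/// Hypothetical stand-in for the parsed file: the table is discovered and
/// validated once, at construction time.
pub struct ParsedFile<'data> {
    table: LazyTable<'data>,
}

impl<'data> ParsedFile<'data> {
    pub fn new(table_bytes: &'data [u8], entsize: usize) -> Option<Self> {
        // One-time validation: a non-zero entry size that evenly divides the data.
        if entsize == 0 || table_bytes.len() % entsize != 0 {
            return None;
        }
        Some(ParsedFile {
            table: LazyTable { data: table_bytes, entsize },
        })
    }

    /// Each call returns a copy of the already validated table handle.
    pub fn table(&self) -> LazyTable<'data> {
        self.table
    }
}
```

Calling table() repeatedly yields independent handles that all point at the same bytes, so no entry data is duplicated.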
The more I work with these, the more they feel like they should just be two distinct types
with their own interfaces, where the ElfParser trait really only serves to complicate the
implementations and get in the way.
The ElfBytes impl is so much simpler than the stream one due to
its ability to work through multiple shared references!
The more I work with these two, the more they seem like they should
really just be two distinct types rather than trying to each impl the same
Trait. Maybe long term the ElfStream interface will diverge to provide
an interface that's easier to code and work with, with some lazy parsing
but also some allocate and parse-it-all to simplify the implementation, since
simpler (usually) means fewer bugs!
section_data_as_notes(): yields a NoteIterator for a given SHT_NOTE section
section_data_as_rels(): yields a RelIterator for a given SHT_REL section
section_data_as_relas(): yields a RelaIterator for a given SHT_RELA section
section_data_as_strtab(): yields a StringTable for a given SHT_STRTAB section
This gets the section data from the file and an optional CompressionHeader
if the section's data is flagged as being compressed.
Returns an empty slice for SHT_NOBITS sections.
These don't need to be generic or exposed outside this file, now that these are
two distinct types. It's also more convenient to have get_bytes() return a Result
rather than an Option that has to be interpreted everywhere.
There is an unfortunate complication with the current File impl, where bytes-based parsing methods
require a `&mut File` due to the Stream-based impl mutating its internal buffer cache. This is overly
restrictive and prevents users of the Bytes-based impl from doing otherwise safe things like
concurrently getting multiple lazy-parsing types for different pieces of the ElfBytes and parsing
from them in tandem.
The goal of this split is to allow ElfBytes to do just that - yield multiple different lazy-parsing
handles at the same time, while also allowing a more restrictive mutating ElfStream. Note that the
ElfParser trait is implemented on a shared reference for ElfBytes but an exclusive mutable reference
for ElfStream.
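A minimal sketch of that arrangement follows. GetBytes, BytesSource, and StreamSource are hypothetical names for illustration only; the real ElfParser trait and the ElfBytes/ElfStream types have a different surface.

```rust
use std::io::{Read, Seek, SeekFrom};

/// Hypothetical trait: `self` is taken by value so that each impl can pick
/// whether the receiver is a shared or an exclusive reference.
pub trait GetBytes<'out> {
    fn get_bytes(self, start: usize, len: usize) -> Option<&'out [u8]>;
}

/// Bytes-backed source: all data is already in memory.
pub struct BytesSource<'data> {
    data: &'data [u8],
}

impl<'data> BytesSource<'data> {
    pub fn new(data: &'data [u8]) -> Self {
        BytesSource { data }
    }
}

/// Stream-backed source: reads into an internal cache, so it must mutate.
pub struct StreamSource<R> {
    reader: R,
    cache: Vec<u8>,
}

impl<R: Read + Seek> StreamSource<R> {
    pub fn new(reader: R) -> Self {
        StreamSource { reader, cache: Vec::new() }
    }
}

// Implemented on a shared reference: many callers can hold slices at once,
// and the returned slice borrows from the original data, not from `self`.
impl<'data, 'r> GetBytes<'data> for &'r BytesSource<'data> {
    fn get_bytes(self, start: usize, len: usize) -> Option<&'data [u8]> {
        let end = start.checked_add(len)?;
        self.data.get(start..end)
    }
}

// Implemented on an exclusive reference: the returned slice borrows the
// cache, so only one result can be alive at a time.
impl<'r, R: Read + Seek> GetBytes<'r> for &'r mut StreamSource<R> {
    fn get_bytes(self, start: usize, len: usize) -> Option<&'r [u8]> {
        self.reader.seek(SeekFrom::Start(start as u64)).ok()?;
        self.cache.resize(len, 0);
        self.reader.read_exact(&mut self.cache).ok()?;
        Some(&self.cache[..])
    }
}
```

With the bytes-backed source, two calls such as (&bytes).get_bytes(0, 2) and (&bytes).get_bytes(4, 2) can have their results alive together; with the stream-backed source the borrow checker rejects holding two results at once.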
These will eventually usurp and replace the existing File interface.
This has four impls:
AnyEndian: Can be used to choose at runtime which byte order to parse integers from
LittleEndian: Can be used to always parse integers from little-endian order
BigEndian: Can be used to always parse integers from big-endian order
NativeEndian: Can be used to always parse integers in the configured target's native endian order
When using the more restricted impls (like LittleEndian), a ParseError::UnsupportedElfEndianness will
be returned if the user attempts to parse an ELF file of the other endianness (for example, a
big-endian file with LittleEndian). With the more restricted impls, the integer parsing code also
gets optimized to skip the conditional dispatch for the appropriate endianness conversion method,
which can be useful when you know your binary only needs to target a fixed endianness.
Currently, the File::open_stream() method always uses AnyEndian, meaning it can parse either big- or
little-endian files. Future patches will expose the new impls.
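A rough sketch of how such impls might look. The trait shape, method names, and the payload on the error variant are assumptions for illustration, not the crate's actual definitions; only the ELFDATA2LSB/ELFDATA2MSB values come from the ELF specification.

```rust
/// EI_DATA values from the ELF specification.
pub const ELFDATA2LSB: u8 = 1;
pub const ELFDATA2MSB: u8 = 2;

/// Simplified, hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    UnsupportedElfEndianness(u8),
    SliceTooShort,
}

/// Hypothetical trait shape: pick an endianness from the file's EI_DATA byte,
/// then parse integers according to it.
pub trait EndianParse: Sized {
    fn from_ei_data(ei_data: u8) -> Result<Self, ParseError>;
    fn is_little(&self) -> bool;

    fn parse_u32_at(&self, offset: usize, data: &[u8]) -> Result<u32, ParseError> {
        let end = offset.checked_add(4).ok_or(ParseError::SliceTooShort)?;
        let b = data.get(offset..end).ok_or(ParseError::SliceTooShort)?;
        let bytes = [b[0], b[1], b[2], b[3]];
        Ok(if self.is_little() {
            u32::from_le_bytes(bytes)
        } else {
            u32::from_be_bytes(bytes)
        })
    }
}

/// Decides at runtime, so every parse goes through a branch.
#[derive(Debug, PartialEq, Eq)]
pub enum AnyEndian {
    Little,
    Big,
}

impl EndianParse for AnyEndian {
    fn from_ei_data(ei_data: u8) -> Result<Self, ParseError> {
        match ei_data {
            ELFDATA2LSB => Ok(AnyEndian::Little),
            ELFDATA2MSB => Ok(AnyEndian::Big),
            other => Err(ParseError::UnsupportedElfEndianness(other)),
        }
    }

    fn is_little(&self) -> bool {
        matches!(self, AnyEndian::Little)
    }
}

/// Fixed at compile time: is_little() is a constant, so the branch in
/// parse_u32_at can be optimized away after monomorphization.
#[derive(Debug, PartialEq, Eq)]
pub struct LittleEndian;

impl EndianParse for LittleEndian {
    fn from_ei_data(ei_data: u8) -> Result<Self, ParseError> {
        match ei_data {
            ELFDATA2LSB => Ok(LittleEndian),
            other => Err(ParseError::UnsupportedElfEndianness(other)),
        }
    }

    #[inline(always)]
    fn is_little(&self) -> bool {
        true
    }
}
```

BigEndian and NativeEndian would follow the same pattern as LittleEndian, each accepting only its own EI_DATA value.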
This has three impls:
AnyEndian: Can be used to choose at runtime which byte order to parse integers from
LittleEndian: Can be used to always parse integers from little-endian order
BigEndian: Can be used to always parse integers from big-endian order
The long term goal of this trait and these three impls is to use them in the ParseAt trait
in order to allow users of the library to decide whether or not they want the overhead of
AnyEndian (the run-time match statement to pick which endianness to parse as).
I kept this as pub(crate) for now, as I'm not sure if the naming is confusing for users or not,
so I want to keep it out of the public interface for now.
This bundles together all the checked integer math to calculate the range, with the
aim of keeping the File impl code as small as possible. My hope with this is to be
able to reasonably split the ELF Bytes and Stream interfaces into two distinct types,
each with their own small bits of code, in order to allow the Bytes interface to work
via shared references rather than mut references.
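As a sketch of what bundling the checked math could look like (entry_range is a hypothetical helper name, not the crate's actual function):

```rust
use std::ops::Range;

/// Hypothetical helper: compute the byte range of entry `index` in a table
/// that starts at `table_start` and has entries of `entsize` bytes.
/// Returns None instead of overflowing or panicking.
pub fn entry_range(table_start: usize, entsize: usize, index: usize) -> Option<Range<usize>> {
    let offset = index.checked_mul(entsize)?;
    let start = table_start.checked_add(offset)?;
    let end = start.checked_add(entsize)?;
    Some(start..end)
}
```

Callers can then pass the result straight to slice::get(), which performs the final bounds check against the actual buffer.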
New Features:
* Add fuzz targets for parts of our ELF parsing interface via cargo-fuzz
* Add SysVHashTable which interprets the contents of a SHT_HASH section
* Add StringTable::get_raw() to get an uninterpreted &[u8]
* Add ParsingTable.len() method to get the number of elements in the table
* Add some note n_type constants for GNU extension notes.
* Add default "to_str" feature to get &str for gabi constant names
Changed Interfaces:
* Change File::segments() to return a ParsingTable instead of just a ParsingIterator
* Change File's SectionHeader interfaces to provide a ParsingTable instead of just a ParsingIterator
* Remove deprecated File::section_data_for_header() in favor of File::section_data()
* Remove FileHeader wrapper types OSABI, Architecture, and ObjectFileType
* Remove ProgramHeader wrapper types ProgType and ProgFlag
* Remove Symbol wrapper types SymbolType SymbolBind SymbolVis
* Remove wrapper type SectionType
* Remove unhelpful SectionFlag wrapper type
* Remove Display impl for FileHeader, SectionHeader, ProgramHeader, Symbol
* Remove ParseError::UnsupportedElfVersion in favor of more general ParseError::UnsupportedVersion
Bug Fixes:
* Fix divide by zero panic when parsing a note with alignment of 0 (Error instead of panic)
* Use checked integer math all over the parsing code (Error instead of panic or overflow)
* Fix note parsing for 8-byte aligned .note.gnu.property sections (Successfully parse instead of Erroring)
* Add size validation when parsing tables with entsizes (Error instead of panic)
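For the divide-by-zero fix listed above, a minimal sketch of the idea, assuming padding is computed from a length and an alignment (padding_for is a hypothetical helper, not the crate's actual code):

```rust
/// Hypothetical helper: number of padding bytes needed to round `len` up to
/// a multiple of `align`. Returns None for a zero alignment rather than
/// panicking with "attempt to calculate the remainder with a divisor of zero".
pub fn padding_for(len: usize, align: usize) -> Option<usize> {
    // checked_rem yields None when the divisor is zero.
    let rem = len.checked_rem(align)?;
    if rem == 0 {
        Some(0)
    } else {
        // rem < align here, so this subtraction cannot underflow.
        align.checked_sub(rem)
    }
}
```

A caller would map the None case to a parse error, so a corrupted alignment field surfaces as an Err instead of a crash.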
This was caused by the alignment modulus calculation when fuzzing gave a note section header with zero-byte alignment:
"attempt to calculate the remainder with a divisor of zero".
I kept this as pub(crate) for now, as I'm not sure if the naming is confusing for users or not,
so I want to keep it out of the public interface for now.
My plan is to use the correct width of the corresponding ELF types when parsing ELF structures
and then convert them into usize as appropriate when those fields are being interpreted
to locate other ELF structures within in-memory buffers. For the common case these days
of 64-bit machines with 64-bit usizes, these conversions will all succeed. For 32-bit
machines, this conversion means that the library will not be able to parse large 64-bit
files. When that happens, though, the library should helpfully return an error instead of crashing.
These same sorts of changes will need to be made throughout the library in order to
harden it against crashing due to integer overflow math (often due to corrupted files).
Fuzzing catches this sort of thing really quickly.
Also, introduce two new resulting ParseError variants:
TryFromIntError and IntegerOverflow
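A sketch of how the two errors could arise in practice. This ParseError is a simplified stand-in with only the two variants named above, and section_range is a hypothetical helper:

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;
use std::ops::Range;

/// Simplified stand-in carrying only the two variants discussed here.
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    IntegerOverflow,
    TryFromIntError(TryFromIntError),
}

impl From<TryFromIntError> for ParseError {
    fn from(err: TryFromIntError) -> Self {
        ParseError::TryFromIntError(err)
    }
}

/// Hypothetical helper: fields are parsed at their ELF width (u64 here) and
/// converted to usize only when used to index an in-memory buffer.
pub fn section_range(sh_offset: u64, sh_size: u64) -> Result<Range<usize>, ParseError> {
    // Fails with TryFromIntError on targets where usize is narrower than the value.
    let start = usize::try_from(sh_offset)?;
    let size = usize::try_from(sh_size)?;
    // Fails with IntegerOverflow when offset + size wraps.
    let end = start.checked_add(size).ok_or(ParseError::IntegerOverflow)?;
    Ok(start..end)
}
```

On a 64-bit target, section_range(u64::MAX, 1) reports IntegerOverflow; on a 32-bit target the same call fails earlier with TryFromIntError. Either way the caller gets an Err rather than a panic.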