ELF object files can have either or both a ProgramHeader and a SectionHeader that
describe how to locate the .dynamic entries in the file. We try the SectionHeaders
first, and if there are none (a loadable-only ELF object), then we check for it in
the ProgramHeaders. It is also OK for there to be no .dynamic entries in the file.
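The lookup order above can be sketched roughly as follows. This is an illustrative stand-in, not the crate's actual API: the tuple-based header representations, the helper names, and DynamicTable are all invented for the example; only the SHT_DYNAMIC (6) and PT_DYNAMIC (2) type constants come from the ELF specification.

```rust
// Hypothetical sketch of the .dynamic lookup fallback described above.

#[derive(Debug, PartialEq)]
struct DynamicTable(u64); // stand-in: file offset of the .dynamic entries

// Headers are modeled as (type, offset) pairs for brevity.
fn dynamic_from_sections(shdrs: &[(u32, u64)]) -> Option<DynamicTable> {
    // SHT_DYNAMIC == 6 in the ELF specification
    shdrs.iter().find(|(sh_type, _)| *sh_type == 6).map(|&(_, off)| DynamicTable(off))
}

fn dynamic_from_segments(phdrs: &[(u32, u64)]) -> Option<DynamicTable> {
    // PT_DYNAMIC == 2 in the ELF specification
    phdrs.iter().find(|(p_type, _)| *p_type == 2).map(|&(_, off)| DynamicTable(off))
}

/// Sections first, then segments; None if the file has no .dynamic at all.
fn find_dynamic(shdrs: &[(u32, u64)], phdrs: &[(u32, u64)]) -> Option<DynamicTable> {
    dynamic_from_sections(shdrs).or_else(|| dynamic_from_segments(phdrs))
}

fn main() {
    // Loadable-only object: no sections, but a PT_DYNAMIC segment at 0x100.
    assert_eq!(find_dynamic(&[], &[(1, 0x0), (2, 0x100)]), Some(DynamicTable(0x100)));
    // No .dynamic anywhere is also OK.
    assert_eq!(find_dynamic(&[], &[(1, 0x0)]), None);
}
```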
Now, all section parsing is done lazily on-demand by the other File methods.
Also, update README.md and the lib.rs doc comment to reflect the current library development state.
These use the lazy parsing methods to read the section headers and section data when requested,
as opposed to reading and parsing all of the section headers and section data up front when the
File is created.
This changes ParseError into an enum with distinct variants, which allows users (and tests) to
easily match on specific error conditions.
It also encodes helpful associated data into the enum variants in order
to provide more descriptive error messages for the various error conditions.
This pattern of enum error construction also defers string formatting to the
user rather than forcing an expensive format! call when the error is generated.
This also properly implements std::error::Error's .source(), as requested in #15.
This also exposes ParseError in the public interface, as requested in #20.
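The pattern described above can be sketched like this. The variant names and payloads here are examples of the pattern, not the crate's actual ParseError definition: construction is cheap (no formatting), Display defers the formatting, and source() exposes the wrapped cause.

```rust
use std::fmt;

// Illustrative sketch of the error-enum pattern; variant names are invented.
#[derive(Debug)]
enum ParseError {
    BadMagic([u8; 4]),
    SliceReadError { offset: usize, size: usize },
    IoError(std::io::Error),
}

impl fmt::Display for ParseError {
    // String formatting happens here, only when the user asks for it.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::BadMagic(magic) => write!(f, "invalid ELF magic: {magic:?}"),
            ParseError::SliceReadError { offset, size } => {
                write!(f, "could not read {size} bytes at offset {offset}")
            }
            ParseError::IoError(e) => write!(f, "I/O error: {e}"),
        }
    }
}

impl std::error::Error for ParseError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            ParseError::IoError(e) => Some(e),
            _ => None,
        }
    }
}

fn main() {
    // Callers can match on specific conditions instead of comparing strings.
    let err = ParseError::BadMagic([0x7f, b'X', b'Y', b'Z']);
    assert!(matches!(err, ParseError::BadMagic(_)));
    assert!(err.to_string().contains("invalid ELF magic"));
}
```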
This enables two use-cases:
1. Parsing ELF structures out of a &[u8] that contains the whole file contents
2. Parsing ELF structures with on-demand I/O out of a Read + Seek through
the CachedReadBytes
Currently, the File still allocates its own internal Vec<u8> buffers into which
to copy the section data.
This will allow us to change section parsing into on-demand lazy
parsing as opposed to always reading and parsing all of the sections
up front as part of open_stream().
This avoids the up-front Vec allocation and parsing in favor of lazily parsing
ProgramHeaders iteratively out of the segment table bytes that were read from
the file. I opted to read the whole segment table up front with the thought that
one larger I/O for the whole table is likely faster than lots of small I/Os to get
each phdr, though that assumes the segment table isn't huge (multiple MBs), which
holds in the common case.
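The read-the-table-once, parse-each-record-lazily idea can be sketched with an iterator over a pre-read buffer. The 8-byte record layout below is a toy stand-in, not the real Elf32/Elf64 program header encoding, and the type names are invented for the example.

```rust
// Sketch: lazily parse fixed-size "phdr" records out of a buffer that was
// filled by a single up-front read of the whole segment table.

struct PhdrIter<'data> {
    table: &'data [u8],
    offset: usize,
}

const PHDR_SIZE: usize = 8; // toy layout: u32 p_type + u32 p_offset, little-endian

impl<'data> Iterator for PhdrIter<'data> {
    type Item = (u32, u32); // toy (p_type, p_offset)

    fn next(&mut self) -> Option<Self::Item> {
        // Each call parses one record; nothing is parsed until requested.
        let bytes = self.table.get(self.offset..self.offset + PHDR_SIZE)?;
        self.offset += PHDR_SIZE;
        let p_type = u32::from_le_bytes(bytes[0..4].try_into().unwrap());
        let p_offset = u32::from_le_bytes(bytes[4..8].try_into().unwrap());
        Some((p_type, p_offset))
    }
}

fn main() {
    // One larger I/O fills `table`; iteration then needs no further I/O or allocation.
    let mut table = Vec::new();
    table.extend_from_slice(&1u32.to_le_bytes());
    table.extend_from_slice(&0x40u32.to_le_bytes());
    table.extend_from_slice(&2u32.to_le_bytes());
    table.extend_from_slice(&0x200u32.to_le_bytes());

    let phdrs: Vec<_> = PhdrIter { table: &table, offset: 0 }.collect();
    assert_eq!(phdrs, vec![(1, 0x40), (2, 0x200)]);
}
```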
This wraps a Read + Seek for our lazy file I/O use-case and allocates and caches
in-memory buffers into which file data is read; those buffers can then be used by
ParseAtExt when parsing ELF structures.
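The read-once-then-cache idea can be sketched as below. The field layout, cache key, and error type are assumptions for illustration, not the crate's actual CachedReadBytes implementation.

```rust
use std::collections::HashMap;
use std::io::{Cursor, Read, Seek, SeekFrom};

// Sketch: wrap a Read + Seek and cache each (offset, size) range after the
// first read so repeated requests don't hit the underlying stream again.
struct CachedReadBytes<R: Read + Seek> {
    reader: R,
    cache: HashMap<(u64, usize), Vec<u8>>,
}

impl<R: Read + Seek> CachedReadBytes<R> {
    fn new(reader: R) -> Self {
        CachedReadBytes { reader, cache: HashMap::new() }
    }

    /// Returns the bytes at (offset, size), performing I/O only on the
    /// first request for that range.
    fn read_bytes_at(&mut self, offset: u64, size: usize) -> std::io::Result<&[u8]> {
        if !self.cache.contains_key(&(offset, size)) {
            let mut buf = vec![0u8; size];
            self.reader.seek(SeekFrom::Start(offset))?;
            self.reader.read_exact(&mut buf)?;
            self.cache.insert((offset, size), buf);
        }
        Ok(self.cache.get(&(offset, size)).unwrap())
    }
}

fn main() -> std::io::Result<()> {
    // Cursor stands in for a real File stream.
    let mut c = CachedReadBytes::new(Cursor::new(vec![1, 2, 3, 4, 5]));
    assert_eq!(c.read_bytes_at(1, 3)?, &[2, 3, 4]);
    assert_eq!(c.read_bytes_at(1, 3)?, &[2, 3, 4]); // served from the cache
    Ok(())
}
```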
ParseAtExt is an extension trait on byte slices which can parse uints
of various sizes in an endian-aware way. This will be used for all parsing.
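An extension trait of this shape might look like the following. The method names, the Endian enum, and the Option-based error handling are assumptions for the sketch, not necessarily what the crate ships.

```rust
// Sketch: endian-aware uint parsing as an extension trait on [u8].

#[derive(Clone, Copy)]
enum Endian {
    Little,
    Big,
}

trait ParseAtExt {
    fn parse_u16_at(&self, endian: Endian, offset: usize) -> Option<u16>;
    fn parse_u32_at(&self, endian: Endian, offset: usize) -> Option<u32>;
}

impl ParseAtExt for [u8] {
    fn parse_u16_at(&self, endian: Endian, offset: usize) -> Option<u16> {
        // get() bounds-checks; try_into() converts the sub-slice to [u8; 2].
        let bytes: [u8; 2] = self.get(offset..offset + 2)?.try_into().ok()?;
        Some(match endian {
            Endian::Little => u16::from_le_bytes(bytes),
            Endian::Big => u16::from_be_bytes(bytes),
        })
    }

    fn parse_u32_at(&self, endian: Endian, offset: usize) -> Option<u32> {
        let bytes: [u8; 4] = self.get(offset..offset + 4)?.try_into().ok()?;
        Some(match endian {
            Endian::Little => u32::from_le_bytes(bytes),
            Endian::Big => u32::from_be_bytes(bytes),
        })
    }
}

fn main() {
    let data: &[u8] = &[0x01, 0x02, 0x03, 0x04];
    assert_eq!(data.parse_u32_at(Endian::Little, 0), Some(0x04030201));
    assert_eq!(data.parse_u32_at(Endian::Big, 0), Some(0x01020304));
    assert_eq!(data.parse_u16_at(Endian::Big, 3), None); // out of bounds
}
```

Returning Option (or a ParseError) from the bounds check keeps callers honest about truncated input rather than panicking on a bad offset.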
ReadBytesAt is a trait which will allow us to provide byte slices
from which to parse ELF data in two different ways:
1. directly from an already-read byte slice (the user reads the whole file
contents up front)
2. in a streaming manner where the user provides a Read + Seek.
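The slice side of that abstraction (way 1) can be sketched like this. The trait shape, method signature, and unit error type are illustrative assumptions; the point is that the in-memory impl is pure sub-slicing with no I/O and no allocation, while a streaming impl can sit behind the same trait.

```rust
// Sketch: one trait, two providers of byte slices. Only the in-memory
// impl is shown; a caching Read + Seek wrapper would implement it too.

trait ReadBytesAt {
    /// Yields `size` bytes starting at `offset`, or an error if unavailable.
    fn read_bytes_at(&mut self, offset: usize, size: usize) -> Result<&[u8], ()>;
}

// Way 1: the user has already read the whole file into memory.
impl ReadBytesAt for &[u8] {
    fn read_bytes_at(&mut self, offset: usize, size: usize) -> Result<&[u8], ()> {
        // Zero-allocation: just a bounds-checked sub-slice.
        self.get(offset..offset + size).ok_or(())
    }
}

fn main() {
    let mut file_data: &[u8] = &[0x7f, b'E', b'L', b'F', 0, 0];
    assert_eq!(file_data.read_bytes_at(0, 4), Ok(&[0x7f, b'E', b'L', b'F'][..]));
    assert!(file_data.read_bytes_at(4, 10).is_err());
}
```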
The lifetime elision rules mean that we don't need to state these explicitly,
since the only invariant we're trying to maintain is that the lifetimes
of method outputs are the same as or shorter than the lifetime of &self.
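A minimal illustration of that elision rule (with an invented File type, not the crate's): a method taking &self and returning a borrow gets the output lifetime tied to &self automatically.

```rust
// The elided signature below is equivalent to the explicit
// fn section_data<'a>(&'a self, offset: usize, size: usize) -> &'a [u8]

struct File {
    data: Vec<u8>,
}

impl File {
    // No named lifetimes needed: the output borrow is tied to &self.
    fn section_data(&self, offset: usize, size: usize) -> &[u8] {
        &self.data[offset..offset + size]
    }
}

fn main() {
    let f = File { data: vec![10, 20, 30, 40] };
    let s = f.section_data(1, 2); // the compiler enforces: s cannot outlive f
    assert_eq!(s, &[20, 30]);
}
```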
These contents can be huge, and it is not clear what a useful representation
of an entire ELF file would be that would suit any given user of
this crate. Users can implement and define their own string interpretation.
My thought is that this trait method will be able to drive the difference between
the two desired parsing strategies:
1. All I/O is done up front by giving us the full object file contents in a
&[u8]. This could drive a zero-allocation parser which lazily parses ELF
structures from byte slices on demand.
2. Lazy I/O done on demand by giving a File stream to some sort of ReadAtExt
trait interpreter which internally allocates Vec<u8> buffers for
read_bytes_at requests and caches the results internally. This tradeoff uses
allocations but can reduce the amount of I/O performed for users who only
want to inspect small subsets of the file contents.