328 Commits

Author SHA1 Message Date
Christopher Cole
3c8ec10195
Move ElfStream into its own module 2022-11-04 15:01:21 -07:00
Christopher Cole
0f41f2b015
Move ElfBytes into its own module 2022-11-04 14:58:53 -07:00
Christopher Cole
35d241f6f3
Change ElfBytes to validate and construct the SegmentTable at parse-time
Other interface methods on ElfBytes want to use the phdrs, and there's
no need to discover and validate the table each time it's needed.

This now performs that discovery and validation once with initial FileHeader
parsing, then simply clones the lazy ParsingTable on request, which is cheap -
it's just copying a slice, not copying actual data bytes.
2022-11-04 14:51:20 -07:00
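The "validate once, then hand out cheap copies" idea described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: `LazyTable`, `ParsedFile`, and all of their methods are hypothetical names invented for this sketch. The point is that a table which only borrows the file bytes is `Copy`, so returning it on request copies a slice reference and a `usize`, never the data itself.

```rust
/// Hypothetical, simplified stand-in for a lazy parsing table: it only
/// borrows the underlying bytes and locates entries on demand.
#[derive(Debug, Clone, Copy)]
pub struct LazyTable<'data> {
    data: &'data [u8],
    entsize: usize,
}

impl<'data> LazyTable<'data> {
    /// Validate once, up front, that the data is a whole number of entries.
    pub fn new(data: &'data [u8], entsize: usize) -> Option<Self> {
        if entsize == 0 || data.len() % entsize != 0 {
            return None;
        }
        Some(LazyTable { data, entsize })
    }

    /// Number of entries in the table.
    pub fn len(&self) -> usize {
        self.data.len() / self.entsize
    }

    /// Borrow the raw bytes of entry `index`, if it exists.
    pub fn get(&self, index: usize) -> Option<&'data [u8]> {
        let start = index.checked_mul(self.entsize)?;
        let end = start.checked_add(self.entsize)?;
        self.data.get(start..end)
    }
}

/// Hypothetical parsed-file type that builds its table once at parse time.
pub struct ParsedFile<'data> {
    phdrs: Option<LazyTable<'data>>,
}

impl<'data> ParsedFile<'data> {
    /// Discovery and validation happen here, exactly once.
    pub fn parse(phdr_bytes: &'data [u8], entsize: usize) -> Self {
        ParsedFile { phdrs: LazyTable::new(phdr_bytes, entsize) }
    }

    /// Hands out a copy of the table: a slice reference plus a usize.
    pub fn segments(&self) -> Option<LazyTable<'data>> {
        self.phdrs
    }
}
```

Because `segments()` takes `&self` and returns a `Copy` value, callers can hold several table handles at the same time.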
Christopher Cole
b6681305e5
Change ElfBytes to validate and construct the SectionHeaderTable at parse-time
Most of the other interface methods on ElfBytes want to use the shdrs, and there's
no need to discover and validate the table each time it's needed.

This now performs that discovery and validation once with initial FileHeader
parsing, then simply clones the lazy ParsingTable on request, which is cheap -
it's just copying a slice, not copying actual data bytes.
2022-11-04 14:34:53 -07:00
Christopher Cole
6c261b885e
Split ElfBytes and ElfStream into their own interfaces by removing the ElfParser Trait
The more I work with these, the more they feel like they should just be two distinct types
with their own interfaces, where the ElfParser trait really only serves to complicate the
implementations and get in the way.
2022-11-04 14:04:33 -07:00
Christopher Cole
879d6b5366
Specify explicit lifetimes on the ReadBytesExt helper trait
This more accurately states the lifetime guarantees for how this is used.
2022-11-04 13:55:20 -07:00
Christopher Cole
62b35934e3
Extend ElfParser trait with section_data_as_symbol_table, symbol_table, and dynamic_symbol_table 2022-11-04 00:16:34 -07:00
Christopher Cole
798e28d156
Extend ElfParser trait with dynamic()
Gets a lazy-parsing iterator over the Dyn entries in the .dynamic section or PT_DYNAMIC segment
2022-11-03 23:42:46 -07:00
Christopher Cole
e58c7e9530
Extend ElfParser trait with section_headers_with_strtab()
The ElfBytes impl is so much simpler than the stream one due to
its ability to work through multiple shared references!

The more I work with these two, the more they seem like they should
really just be two distinct types rather than trying to each impl the same
Trait. Maybe long term the ElfStream interface will diverge to provide
an interface that's easier to code and work with, with some lazy parsing
but also some allocate and parse-it-all to simplify the implementation, since
simpler (usually) means fewer bugs!
2022-11-03 23:24:31 -07:00
Christopher Cole
682bd99348
Extend ElfParser trait with segment_data() and segment_data_as_notes() 2022-11-03 22:44:06 -07:00
Christopher Cole
acd1e5c797
Extend ElfParser trait with a few section_data_as_... methods
section_data_as_notes(): yields a NoteIterator for a given SHT_NOTE section
section_data_as_rels(): yields a RelIterator for a given SHT_REL section
section_data_as_relas(): yields a RelaIterator for a given SHT_RELA section
section_data_as_strtab(): yields a StringTable for a given SHT_STRTAB section
2022-11-03 22:30:21 -07:00
Christopher Cole
4327ba9929
Extend ElfParser trait with section_data() method
This gets the section data from the file and an Optional CompressionHeader
if the section's data is flagged as being compressed.

Returns an empty slice for SHT_NOBITS
2022-11-03 17:28:41 -07:00
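A rough sketch of the behaviour described above, under several assumptions: the `SectionHeader` struct and the `section_data` function here are hypothetical simplifications, and a plain `bool` stands in for the optional compression header that the commit mentions. `SHT_NOBITS` (8) and `SHF_COMPRESSED` (0x800) are the standard ELF constant values.

```rust
use std::convert::TryFrom;

pub const SHT_NOBITS: u32 = 8;
pub const SHF_COMPRESSED: u64 = 0x800;

/// Hypothetical minimal section header with only the fields used here.
pub struct SectionHeader {
    pub sh_type: u32,
    pub sh_flags: u64,
    pub sh_offset: u64,
    pub sh_size: u64,
}

/// Returns the section's bytes plus a flag saying whether they are
/// marked as compressed. Returns None if the range is out of bounds.
pub fn section_data<'a>(file: &'a [u8], shdr: &SectionHeader) -> Option<(&'a [u8], bool)> {
    // SHT_NOBITS sections occupy no space in the file, whatever sh_size says.
    if shdr.sh_type == SHT_NOBITS {
        let empty: &'a [u8] = &[];
        return Some((empty, false));
    }
    let start = usize::try_from(shdr.sh_offset).ok()?;
    let len = usize::try_from(shdr.sh_size).ok()?;
    let end = start.checked_add(len)?;
    let data = file.get(start..end)?;
    Some((data, shdr.sh_flags & SHF_COMPRESSED != 0))
}
```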
Christopher Cole
3bcd325e0a
Rework the new interfaces used by ElfBytes and ElfStream to get bytes
These don't need to be generic or exposed outside this file, now that these are
two distinct types. It's also more convenient to have get_bytes() return a Result
rather than an option that has to be interpreted everywhere.
2022-11-03 17:00:08 -07:00
Christopher Cole
15e476f449
Extend ElfParser trait with section_headers() method
This gets the SectionHeaderTable (if any) by itself (without any associated StringTable),
and properly handles e_shnum > SHN_LORESERVE.
2022-11-03 16:43:04 -07:00
Christopher Cole
cb3e087131
Add a doc comment example for from_bytes() 2022-11-03 15:40:05 -07:00
Christopher Cole
70e05c111b
First step of implementing a new ElfParser trait with an ElfBytes and ElfStream impl
There is an unfortunate complication with the current File impl, where bytes-based parsing methods
require a `&mut File` due to the Stream-based impl mutating its internal buffer cache. This is overly
restrictive and prevents users of the Bytes-based impl from doing otherwise safe things like
concurrently getting multiple lazy-parsing types for different pieces of the ElfBytes and parsing
from them in tandem.

The goal of this split is to allow ElfBytes to do just that - yield multiple different lazy-parsing
handles at the same time, while also allowing a more restrictive mutating ElfStream. Note that the
ElfParser trait is implemented on a shared reference for ElfBytes but an exclusive mutable reference
for ElfStream.

These will eventually usurp and replace the existing File interface.
2022-11-03 14:59:06 -07:00
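The shared-versus-exclusive distinction in this commit can be illustrated with two toy types. This is a sketch only: `BytesParser`, `StreamParser`, and their `get_bytes` methods are hypothetical names, not the crate's interface. The bytes-backed type reads through `&self`, so several borrowed results can be alive at once. The stream-backed type must seek and refill an internal buffer, so it needs `&mut self` and can only lend out one result at a time.

```rust
use std::io::{Read, Seek, SeekFrom};

/// Hypothetical bytes-backed parser: reads never mutate, so `&self` suffices.
pub struct BytesParser<'data> {
    data: &'data [u8],
}

impl<'data> BytesParser<'data> {
    pub fn new(data: &'data [u8]) -> Self {
        BytesParser { data }
    }

    /// The result borrows from the file data, not from `self`.
    pub fn get_bytes(&self, start: usize, len: usize) -> Option<&'data [u8]> {
        let end = start.checked_add(len)?;
        self.data.get(start..end)
    }
}

/// Hypothetical stream-backed parser: each read seeks and overwrites an
/// internal buffer, so it requires exclusive access.
pub struct StreamParser<R: Read + Seek> {
    reader: R,
    buf: Vec<u8>,
}

impl<R: Read + Seek> StreamParser<R> {
    pub fn new(reader: R) -> Self {
        StreamParser { reader, buf: Vec::new() }
    }

    /// The result borrows from `self.buf`, so it must be dropped before
    /// the next call can overwrite the buffer.
    pub fn get_bytes(&mut self, start: u64, len: usize) -> std::io::Result<&[u8]> {
        self.reader.seek(SeekFrom::Start(start))?;
        self.buf.resize(len, 0);
        self.reader.read_exact(&mut self.buf)?;
        Ok(&self.buf)
    }
}
```

With `BytesParser`, two calls to `get_bytes` can be held and read in tandem; with `StreamParser`, the borrow checker rejects holding the first result across a second call.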
Christopher Cole
9e9bbb84fb
Expand on the doc comments in src/endian.rs 2022-11-03 13:19:29 -07:00
Christopher Cole
7c72fb0f3f
Make ELF structures generic across the new endian-aware integer parsing trait EndianParse
This has four impls:
AnyEndian: Can be used to choose which byte order to parse integers from at runtime
LittleEndian: Can be used to always parse integers from little-endian order
BigEndian: Can be used to always parse integers from big-endian order
NativeEndian: Can be used to always parse integers in the configured target's native endian order

When using the more restricted impls (like LittleEndian), a ParseError::UnsupportedElfEndianness will
be returned if the user attempts to parse a BigEndian ELF file. When using the more restricted impls,
the integer parsing code gets optimized to skip the conditional dispatch for the appropriate endianness
conversion method, which can be useful for uses where you know your binary only wants to target
a fixed endianness.

Currently, the File::open_stream() method always uses AnyEndian, meaning it can parse either Big or Little
endian files. Future patches will expose the new impls.
2022-11-03 11:47:20 -07:00
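One way such a trait could be shaped is sketched below. The type names `EndianParse`, `AnyEndian`, `LittleEndian`, and `BigEndian` come from the commit message; everything else (the method names `from_ei_data`, `is_little`, and `parse_u32_at`, and the `SketchError` type) is an assumption made for illustration and may not match the crate. The fixed-endian impls return a constant from `is_little()`, which gives the compiler the chance to drop the runtime branch after inlining.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum SketchError {
    UnsupportedEndianness(u8),
    OutOfBounds,
}

/// Standard values of the EI_DATA byte in the ELF identification header.
pub const ELFDATA2LSB: u8 = 1;
pub const ELFDATA2MSB: u8 = 2;

pub trait EndianParse: Sized {
    /// Build a parser for the file's declared byte order, or refuse it.
    fn from_ei_data(ei_data: u8) -> Result<Self, SketchError>;
    fn is_little(&self) -> bool;

    /// Parse a u32 at `offset` using bounds-checked, overflow-checked math.
    fn parse_u32_at(&self, offset: usize, data: &[u8]) -> Result<u32, SketchError> {
        let end = offset.checked_add(4).ok_or(SketchError::OutOfBounds)?;
        let slice = data.get(offset..end).ok_or(SketchError::OutOfBounds)?;
        let mut bytes = [0u8; 4];
        bytes.copy_from_slice(slice);
        Ok(if self.is_little() {
            u32::from_le_bytes(bytes)
        } else {
            u32::from_be_bytes(bytes)
        })
    }
}

/// Decides the byte order at runtime.
#[derive(Debug, PartialEq, Eq)]
pub struct AnyEndian {
    little: bool,
}

/// Accepts only little-endian files.
#[derive(Debug, PartialEq, Eq)]
pub struct LittleEndian;

/// Accepts only big-endian files.
#[derive(Debug, PartialEq, Eq)]
pub struct BigEndian;

impl EndianParse for AnyEndian {
    fn from_ei_data(ei_data: u8) -> Result<Self, SketchError> {
        match ei_data {
            ELFDATA2LSB => Ok(AnyEndian { little: true }),
            ELFDATA2MSB => Ok(AnyEndian { little: false }),
            other => Err(SketchError::UnsupportedEndianness(other)),
        }
    }
    fn is_little(&self) -> bool {
        self.little
    }
}

impl EndianParse for LittleEndian {
    fn from_ei_data(ei_data: u8) -> Result<Self, SketchError> {
        match ei_data {
            ELFDATA2LSB => Ok(LittleEndian),
            other => Err(SketchError::UnsupportedEndianness(other)),
        }
    }
    #[inline(always)]
    fn is_little(&self) -> bool {
        true
    }
}

impl EndianParse for BigEndian {
    fn from_ei_data(ei_data: u8) -> Result<Self, SketchError> {
        match ei_data {
            ELFDATA2MSB => Ok(BigEndian),
            other => Err(SketchError::UnsupportedEndianness(other)),
        }
    }
    #[inline(always)]
    fn is_little(&self) -> bool {
        false
    }
}
```

A `NativeEndian` impl is omitted here; it could plausibly be an alias chosen with `cfg(target_endian)`, though that is a guess rather than something the commit states.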
Christopher Cole
f876a89b44
Change SymbolVersionTable::new() to take Options instead of Default-empty iterators
This more accurately and explicitly states reality, and also makes some generic typing easier
for us later.
2022-11-03 00:26:23 -07:00
Christopher Cole
a5aaa3330e
Move validate_entsize from ParsingTable onto ParseAt
This helps us out for future work where we make ParsingTable a more complicated type
2022-11-02 23:13:09 -07:00
Christopher Cole
1783b99aa1
Add a new impl of an endian-aware integer parsing trait
This has three impls:
AnyEndian: Can be used to choose which byte order to parse integers from at runtime
LittleEndian: Can be used to always parse integers from little-endian order
BigEndian: Can be used to always parse integers from big-endian order

The long term goal of this trait and these three impls is to use them in the ParseAt trait
in order to allow users of the library to decide whether or not they want the overhead of
AnyEndian (the run-time match statement to pick which endianness to parse as).
2022-11-02 22:16:33 -07:00
Christopher Cole
5de81cae98
Add gabi constants for SHN_ABS and SHN_COMMON 2022-11-02 16:56:37 -07:00
Christopher Cole
6b0974f6fa
Define PT_GNU_PROPERTY constant in gabi.rs 2022-11-02 15:24:05 -07:00
Christopher Cole
a30814d861
Dedupe some safe phdr.p_offset + phdr.p_filesz calculations with ProgramHeader::get_file_data_range()
I kept this as pub(crate) for now, as I'm not sure if the naming is confusing for users or not,
so I want to keep it out of the public interface for now.
2022-11-02 14:58:16 -07:00
Christopher Cole
f46a28c22d
Refactor internal impl of File::segments() to use FileHeader.get_phdrs_data_range()
This bundles together all the checked integer math to calculate the range with the
desire to keep File impl code as small as possible. My hope with this is to be
able to reasonably split the ELF Bytes and Stream interfaces into two distinct types
each with their small bits of code in order to allow the Bytes interface to work
via shared references rather than mut references.
2022-11-02 14:45:54 -07:00
Christopher Cole
aa41e552ed
Bump crate version to v0.6.0
New Features:
* Add fuzz targets for parts of our ELF parsing interface via cargo-fuzz
* Add SysVHashTable which interprets the contents of a SHT_HASH section
* Add StringTable::get_raw() to get an uninterpreted &[u8]
* Add ParsingTable.len() method to get the number of elements in the table
* Add some note n_type constants for GNU extension notes.
* Add default "to_str" feature to get &str for gabi constant names

Changed Interfaces:
* Change File::segments() to return a ParsingTable instead of just a ParsingIterator
* Change File's SectionHeader interfaces to provide a ParsingTable instead of just a ParsingIterator
* Remove deprecated File::section_data_for_header() in favor of File::section_data()
* Remove FileHeader wrapper types OSABI, Architecture, and ObjectFileType
* Remove ProgramHeader wrapper types ProgType and ProgFlag
* Remove Symbol wrapper types SymbolType SymbolBind SymbolVis
* Remove wrapper type SectionType
* Remove unhelpful SectionFlag wrapper type
* Remove Display impl for FileHeader, SectionHeader, ProgramHeader, Symbol
* Remove ParseError::UnsupportedElfVersion in favor of more general ParseError::UnsupportedVersion

Bug Fixes:
* Fix divide by zero panic when parsing a note with alignment of 0 (Error instead of panic)
* Use checked integer math all over the parsing code (Error instead of panic or overflow)
* Fix note parsing for 8-byte aligned .note.gnu.property sections (Successfully parse instead of Erroring)
* Add size validation when parsing tables with entsizes (Error instead of panic)
2022-11-01 16:54:08 -07:00
Christopher Cole
125681c359
Optimize github fuzz action
The action environment apparently already has gcc & g++ installed, so there's no need to waste time with the apt-get update
2022-11-01 13:05:39 -07:00
Christopher Cole
801c887a9e
Attempt at fixing fuzz github action tools install 2022-11-01 12:54:46 -07:00
Christopher Cole
5849b78f92
First attempt at a github fuzzing action
Not sure if this will work or not, and also not sure if there's a way to test a github action config without committing it... so... we'll do it live!
2022-11-01 12:45:29 -07:00
Christopher Cole
848f648996
Add some fuzz targets for some parts of our ELF parsing interface via cargo-fuzz
I decided to make multiple smaller fuzz targets like this in order to give each one
a smaller fuzzing domain to explore for that particular feature.
2022-11-01 12:14:15 -07:00
Christopher Cole
762f200231
Fix divide by zero panic when parsing a note with alignment of 0
This was caused by the alignment modulus calculation when fuzzing gave a note section header with zero-byte alignment
"attempt to calculate the remainder with a divisor of zero"
2022-11-01 11:48:39 -07:00
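The panic quoted above comes from using `%` with a zero divisor. A minimal sketch of the error-instead-of-panic approach, assuming a hypothetical helper named `align_up` (not taken from the crate), could use `checked_rem`, which returns `None` when the divisor is zero:

```rust
/// Round `offset` up to the next multiple of `align`.
/// Returns None for a zero alignment or if the result would overflow.
pub fn align_up(offset: usize, align: usize) -> Option<usize> {
    // checked_rem yields None when align == 0 instead of panicking.
    let rem = offset.checked_rem(align)?;
    if rem == 0 {
        Some(offset)
    } else {
        // rem < align here, so the subtraction cannot underflow.
        offset.checked_add(align - rem)
    }
}
```

A caller can then map `None` to a parse error rather than letting a corrupted alignment value crash the process.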
Christopher Cole
199caed7c2
Use checked integer math in GNU Symbol Versioning iterators 2022-11-01 11:08:54 -07:00
Christopher Cole
00986c6a05
Use checked integer math where appropriate in note.rs 2022-11-01 02:10:02 -07:00
Christopher Cole
07a567d3dd
Use checked integer math in hash.rs 2022-11-01 02:00:35 -07:00
Christopher Cole
d9d7cd4b79
Use checked integer math in File::symbol_version_table() 2022-11-01 01:13:00 -07:00
Christopher Cole
8e27fc920e
Dedupe a lot of start, size, end calculations with SectionHeader::get_data_range()
I kept this as pub(crate) for now, as I'm not sure if the naming is confusing for users or not,
so I want to keep it out of the public interface for now.
2022-11-01 01:06:06 -07:00
Christopher Cole
2e8365c9f7
Use checked integer math in File::dynamic_section() 2022-11-01 00:34:05 -07:00
Christopher Cole
47b687157d
Use checked integer math in File::get_symbol_table_of_type() 2022-11-01 00:31:23 -07:00
Christopher Cole
0564da4f97
Use checked integer math in ParsingTable::get() 2022-11-01 00:21:37 -07:00
Christopher Cole
eea7bbc757
Use checked integer math in the various File::section_data* methods 2022-11-01 00:18:50 -07:00
Christopher Cole
a2cf456530
Use checked integer math in File::section_data() 2022-11-01 00:05:10 -07:00
Christopher Cole
0421f9bfea
Use checked integer math in File::segments() 2022-11-01 00:02:12 -07:00
Christopher Cole
86416f4d0a
Use checked integer math in endian-aware integer parsing methods 2022-10-31 23:57:03 -07:00
Christopher Cole
138a5390ba
Use checked integer math in File::section_headers() 2022-10-31 23:50:25 -07:00
Christopher Cole
00e30b3c2f
Use checked integer math in File::section_headers_by_index() 2022-10-31 23:41:10 -07:00
Christopher Cole
411a9f066d
Use checked integer math in File::section_headers_with_strtab()
My plan is to use the correct width of the corresponding ELF types when parsing ELF structures
and then convert them into usize as appropriate when those fields are being interpreted
to locate other ELF structures within in-memory buffers. For the common case these days
of 64-bit machines with 64-bit usizes, these conversions will all succeed. For 32-bit
machines, this conversion means that the library will not be able to parse large 64-bit
files. When that happens, though, the library should helpfully return an error instead of crashing.

These same sorts of changes will need to be made throughout the library in order to
harden it against crashing due to integer overflow math (often due to corrupted files).

Fuzzing catches this sort of thing really quickly.

Also, introduce two new resulting ParseError types:
  TryFromIntError and IntegerOverflow
2022-10-31 23:27:40 -07:00
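The plan described in this commit (keep ELF-width integers while parsing, convert to `usize` only when indexing a buffer, and report failures as errors) might look roughly like the sketch below. The `RangeError` enum and the `data_range` function are hypothetical; the variant names only echo the two error kinds the commit message mentions.

```rust
use std::convert::TryFrom;
use std::ops::Range;

/// Hypothetical error type mirroring the two failure modes named above.
#[derive(Debug, PartialEq, Eq)]
pub enum RangeError {
    /// A 64-bit field does not fit in this target's usize.
    TryFromInt,
    /// offset + size overflowed.
    IntegerOverflow,
}

/// Turn a file offset and size (as stored in 64-bit ELF fields) into a
/// usize range suitable for indexing an in-memory buffer.
pub fn data_range(offset: u64, size: u64) -> Result<Range<usize>, RangeError> {
    let start = usize::try_from(offset).map_err(|_| RangeError::TryFromInt)?;
    let len = usize::try_from(size).map_err(|_| RangeError::TryFromInt)?;
    let end = start.checked_add(len).ok_or(RangeError::IntegerOverflow)?;
    Ok(start..end)
}
```

Pairing the resulting range with `slice::get` rather than direct indexing keeps an out-of-bounds range from panicking as well.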
Christopher Cole
e5c4892e71
Represent empty string table data with the empty slice instead of None
This matches our other lazy-parsing types
2022-10-31 20:55:31 -07:00
Christopher Cole
277058f448
Inline size constants for ParseAt types
These used to be used by tests, but now that's taken care of automatically
with ParseAt::size_for() in the test helpers.
2022-10-31 20:06:17 -07:00
Christopher Cole
af4b57b18c
Add some note n_type constants for GNU extension notes. 2022-10-31 19:22:08 -07:00
Christopher Cole
3713bb238f
Fix note parsing for 8-byte aligned .note.gnu.property sections
The current state seems to satisfy the cases I've observed so far. We'll see
how many variations of size/alignment exist out in the wild :P
2022-10-31 19:13:13 -07:00