Tracking PR for 3.0 #105

Merged
merged 30 commits into from
Aug 7, 2017
520da8e
feat: Allow ParseError to be used without the StreamOnce constraint
Marwes Mar 25, 2017
2f92b29
feat: Add map_token and map_range methods to ParseError
Marwes Jul 24, 2017
f50ab9e
fix: Make the positions of slice streams harder to misuse
Marwes Jul 26, 2017
7e69208
feat: Add the range_of parser
Marwes Jul 29, 2017
2b96afc
refactor: Split the position out of the StreamOnce trait
Marwes Jul 31, 2017
96da7ee
feat: Teach the choice parser to take tuples
Marwes Jul 31, 2017
900f6a4
chore: Remove use of deprecated functions
Marwes Jul 31, 2017
98ed5a3
Add {Range}Positioner and State 2.0 for parsing third-party tokens. F…
ncalexan Jan 13, 2017
59698a4
Update the new State type to work with 3.0
Marwes Jul 31, 2017
c085dec
Make the added positioner traits take the Item/Range by parameter
Marwes Jul 31, 2017
4cf8cff
fix: Add From<u8> for Info
Marwes Jul 31, 2017
ae43f8a
feat: Remove the old State type and Positioner trait
Marwes Jul 31, 2017
8e987d3
refactor: Remove unused types such as BytePosition
Marwes Jul 31, 2017
3add407
fix: Renamed SharedBufferedStream and BufferedStream to be less confu…
Marwes Aug 1, 2017
4f4b4aa
test
Marwes Jul 11, 2017
54fecc6
fix: Add the correct errors after sequencing has returned EmptyOk
Marwes Jul 22, 2017
7e27c52
fix: Don't forward tuple parsers to frunk to prevent a performance loss
Marwes Jul 25, 2017
4d6ff99
Remove the frunk dependency
Marwes Jul 26, 2017
626011b
docs: Explain TrackedError a bit
Marwes Aug 2, 2017
1a49464
chore: Don't build on older travis versions
Marwes Aug 2, 2017
9107342
fix: Remove depreceated items
Marwes Aug 2, 2017
3fbcad4
Handle EmptyOk -> EmptyErr when composing sequencing and alternating …
Marwes Aug 4, 2017
3cb29bc
fix compilation of json.rs
Marwes Aug 4, 2017
81cffe7
Test implementing choice by specializing with macros instead of forwa…
Marwes Jul 31, 2017
357e162
refactor: Rename TrackedError to Tracked
Marwes Aug 6, 2017
57b58c9
refactor: Rename range_of to recognize
Marwes Aug 6, 2017
f9c1a2b
Pick up that word!
Marwes Aug 7, 2017
17d4c16
chore: Version 3.0.0-alpha.1
Marwes Aug 7, 2017
e1a2c80
docs: Add an "upgrade guide" for 3.0
Marwes Aug 7, 2017
3675d33
Fix typos
Marwes Aug 7, 2017
1 change: 0 additions & 1 deletion .travis.yml
@@ -13,7 +13,6 @@ rust:
- nightly
- beta
- stable
- 1.17.0
before_script:
- |
pip install 'travis-cargo<0.2' --user &&
33 changes: 33 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,36 @@
<a name="v3.0.0-alpha.1"></a>
## v3.0.0-alpha.1 (2017-08-07)


#### Features

* Remove the old State type and Positioner trait ([ae43f8ae](https://github.com/Marwes/combine/commit/ae43f8ae2b303aca3b5ae9fbb1a87475349f2745), breaks [#](https://github.com/Marwes/combine/issues/))
* Teach the choice parser to take tuples ([96da7ee0](https://github.com/Marwes/combine/commit/96da7ee0cf8a112e60747a0be8a4dbd90efbecba), breaks [#](https://github.com/Marwes/combine/issues/))
* Add the range_of parser ([7e692086](https://github.com/Marwes/combine/commit/7e69208650f7fdc75279370b193030b09ccdbc7a), closes [#83](https://github.com/Marwes/combine/issues/83), breaks [#](https://github.com/Marwes/combine/issues/))
* Add map_token and map_range methods to ParseError ([2f92b296](https://github.com/Marwes/combine/commit/2f92b29669b618535bcd7533b7dd39b7daa8579b), closes [#86](https://github.com/Marwes/combine/issues/86))
* Allow ParseError to be used without the StreamOnce constraint ([520da8e8](https://github.com/Marwes/combine/commit/520da8e89f7162b4d6ba3a3bca05a05f3bd37999), breaks [#](https://github.com/Marwes/combine/issues/))

#### Bug Fixes

* Remove depreceated items ([9107342a](https://github.com/Marwes/combine/commit/9107342a89a5efc664bac9c2919a93a992ca6809), breaks [#](https://github.com/Marwes/combine/issues/))
* Don't forward tuple parsers to frunk to prevent a performance loss ([7e27c523](https://github.com/Marwes/combine/commit/7e27c523da46828b254ee4fc7c1f9750623e5aff))
* Add the correct errors after sequencing has returned EmptyOk ([54fecc62](https://github.com/Marwes/combine/commit/54fecc62938445aae15373a6b1ec7c4419582025), closes [#95](https://github.com/Marwes/combine/issues/95))
* Renamed SharedBufferedStream and BufferedStream to be less confusing ([3add407e](https://github.com/Marwes/combine/commit/3add407eecf886cc72ce05414d58a2b3b19a0bb9), breaks [#](https://github.com/Marwes/combine/issues/))
* Add From<u8> for Info ([4cf8cff6](https://github.com/Marwes/combine/commit/4cf8cff64466519bf2d4a4dc1dcbe8deb449e004))
* Make the positions of slice streams harder to misuse ([f50ab9e2](https://github.com/Marwes/combine/commit/f50ab9e2f42ec2465368bfb11a60b2339b699fc4), closes [#104](https://github.com/Marwes/combine/issues/104), breaks [#](https://github.com/Marwes/combine/issues/))

#### Breaking Changes

* Remove depreceated items ([9107342a](https://github.com/Marwes/combine/commit/9107342a89a5efc664bac9c2919a93a992ca6809), breaks [#](https://github.com/Marwes/combine/issues/))
* Renamed SharedBufferedStream and BufferedStream to be less confusing ([3add407e](https://github.com/Marwes/combine/commit/3add407eecf886cc72ce05414d58a2b3b19a0bb9), breaks [#](https://github.com/Marwes/combine/issues/))
* Remove the old State type and Positioner trait ([ae43f8ae](https://github.com/Marwes/combine/commit/ae43f8ae2b303aca3b5ae9fbb1a87475349f2745), breaks [#](https://github.com/Marwes/combine/issues/))
* Teach the choice parser to take tuples ([96da7ee0](https://github.com/Marwes/combine/commit/96da7ee0cf8a112e60747a0be8a4dbd90efbecba), breaks [#](https://github.com/Marwes/combine/issues/))
* Add the range_of parser ([7e692086](https://github.com/Marwes/combine/commit/7e69208650f7fdc75279370b193030b09ccdbc7a), closes [#83](https://github.com/Marwes/combine/issues/83), breaks [#](https://github.com/Marwes/combine/issues/))
* Make the positions of slice streams harder to misuse ([f50ab9e2](https://github.com/Marwes/combine/commit/f50ab9e2f42ec2465368bfb11a60b2339b699fc4), closes [#104](https://github.com/Marwes/combine/issues/104), breaks [#](https://github.com/Marwes/combine/issues/))
* Allow ParseError to be used without the StreamOnce constraint ([520da8e8](https://github.com/Marwes/combine/commit/520da8e89f7162b4d6ba3a3bca05a05f3bd37999), breaks [#](https://github.com/Marwes/combine/issues/))



<a name="v2.5.0"></a>
## v2.5.0 (2017-08-07)

4 changes: 2 additions & 2 deletions Cargo.toml
@@ -1,7 +1,7 @@
[package]

name = "combine"
version = "2.5.0"
version = "3.0.0-alpha.1"
authors = ["Markus Westerlind <marwes91@gmail.com>"]

description = "Fast parser combinators on arbitrary streams with zero-copy support."
@@ -49,4 +49,4 @@ harness = false
required-features = ["mp4"]

[package.metadata.docs.rs]
features = ["doc"]
features = ["doc"]
31 changes: 25 additions & 6 deletions README.md
@@ -1,17 +1,21 @@
# combine
[![Build Status](https://travis-ci.org/Marwes/combine.svg?branch=master)](https://travis-ci.org/Marwes/combine) [![Docs v1](https://docs.rs/combine/badge.svg?version=^1)](https://docs.rs/combine/^1) [![Docs](https://docs.rs/combine/badge.svg)](https://docs.rs/combine) [![Gitter](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/Marwes/combine?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge)
[![Build Status](https://travis-ci.org/Marwes/combine.svg?branch=master)](https://travis-ci.org/Marwes/combine) [![Docs v2](https://docs.rs/combine/badge.svg?version=^2)](https://docs.rs/combine/^2) [![Docs](https://docs.rs/combine/badge.svg)](https://docs.rs/combine) [![Gitter](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/Marwes/combine?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge)

An implementation of parser combinators for Rust, inspired by the Haskell library [Parsec](https://hackage.haskell.org/package/parsec). As in Parsec the parsers are [LL(1)](https://en.wikipedia.org/wiki/LL_parser) by default but they can opt-in to arbitrary lookahead using the [try combinator](https://marwes.github.io/combine/combine/fn.try.html).

## Example

```rust
extern crate combine;
use combine::{many, Parser};
use combine::char::letter;
use combine::{many1, Parser, sep_by};
use combine::char::{letter, space};

let result = many(letter()).parse("hello world");
assert_eq!(result, Ok(("hello".to_string(), " world")));
let word = many1(letter());

let mut parser = sep_by(word, space())
.map(|mut words: Vec<String>| words.pop());
let result = parser.parse("Pick up that word!");
assert_eq!(result, Ok((Some("word".to_string()), "!")));
```

Larger examples can be found in the [tests][tests] and [benches][benches] folders.
@@ -37,7 +41,7 @@ If you end up trying it I welcome any feedback from your experience with it. I a

### Why do my errors contain inscrutable positions?

Since `combine` aims to create parsers with little to no overhead, streams over `&str` and `&[T]` do not carry any extra position information. Instead, they rely only on comparing the pointer of the buffer to check which `Stream` is further ahead than another `Stream`. To retrieve a better position, either call `translate_position` on the `ParseError` or wrap your stream with `State`.
Since `combine` aims to create parsers with little to no overhead, streams over `&str` and `&[T]` do not carry any extra position information. Instead, they rely only on comparing the pointer of the buffer to check which `Stream` is further ahead than another `Stream`. To retrieve a better position, either call `translate_position` on the `PointerOffset` which represents the position or wrap your stream with `State`.
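
The pointer comparison described in this FAQ entry can be illustrated without combine itself. The following is a minimal sketch that assumes only the standard library; `offset_in` is a hypothetical helper, not part of combine's API. It shows how the address of the remaining input can be turned into a byte offset relative to the original buffer, which is presumably the kind of translation `translate_position` is meant to perform.

```rust
/// Hypothetical helper (not part of combine): returns the byte offset of
/// `rest` inside `original`, assuming `rest` is a sub-slice of `original`.
fn offset_in(original: &str, rest: &str) -> usize {
    let start = original.as_ptr() as usize;
    let current = rest.as_ptr() as usize;
    // Both pointers point into the same buffer, so their difference is the
    // number of bytes that lie before `rest`.
    current - start
}

fn main() {
    let input = "Pick up that word!";
    // Pretend a parser has consumed the first five bytes ("Pick ").
    let rest = &input[5..];
    assert_eq!(offset_in(input, rest), 5);
    println!("remaining input starts at byte offset {}", offset_in(input, rest));
}
```

A raw pointer value on its own says nothing useful to a reader of an error message, which is why it has to be made relative to the start of the input before it is displayed.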

## Extra

@@ -47,6 +51,9 @@ You can find older versions of combine (parser-combinators) [here](https://crate

## Contributing

Current master is the 3.0.0 branch. If you want to submit a fix or feature to the 2.x version of combine then
do so to the 2.x branch or submit the PR to master and request that it be backported.

The easiest way to contribute is to just open an issue about any problems you encounter using combine but if you are interested in adding something to the library here is a list of some of the easier things to work on to get started.

* __Add additional parsers__ There is a list of parsers which aren't implemented [here][add parsers] but if you have a suggestion for another parser just leave a suggestion on the issue itself.
@@ -59,6 +66,18 @@ The easiest way to contribute is to just open an issue about any problems you en

Here is a list containing most of the breaking changes in older versions of combine (parser-combinators).

### 3.0.0-alpha.1

* Deprecated items have been changed or removed. Upgrade to the latest version of 2.x first and fix all
deprecations before upgrading to 3.x.
* If you have written `ParseError<I>` explicitly, it needs to be changed to `StreamError<I>` as
`ParseError`'s type signature has changed slightly. Function calls should not be affected, however.
* Parsers now return `Tracked<StreamError<I>>` instead of plain `ParseError<I>`. `Tracked` is an internal
wrapper which should just be constructed via `From::from` or `Into::into`. If you return errors explicitly
somewhere you will need to add `.into()` on the errors to wrap them.
* A few other changes should be detected and fixed easily by simply compiling and fixing the compile errors.
See [CHANGELOG.md](https://github.com/Marwes/combine/blob/master/CHANGELOG.md) for a complete list of breaking changes.
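
The `.into()` step from the upgrade notes can be sketched with stand-in types. This is a simplified model and not combine's actual definitions: the `StreamError` and `Tracked` structs below, their fields, and the `expect_digit` function are all hypothetical. It only illustrates how a `From` implementation lets a hand-written parser wrap a plain error by calling `.into()`.

```rust
// Stand-in types for illustration only; combine's real definitions differ.
#[derive(Debug, PartialEq)]
struct StreamError {
    position: usize,
    message: String,
}

/// Simplified model of a wrapper that stores bookkeeping next to the error.
#[derive(Debug, PartialEq)]
struct Tracked<E> {
    error: E,
    offset: u8, // hypothetical internal bookkeeping field
}

impl<E> From<E> for Tracked<E> {
    fn from(error: E) -> Self {
        Tracked { error, offset: 1 }
    }
}

/// A hand-written "parser" that returns an error explicitly.
fn expect_digit(input: &str) -> Result<(char, &str), Tracked<StreamError>> {
    match input.chars().next() {
        Some(c) if c.is_ascii_digit() => Ok((c, &input[1..])),
        _ => Err(StreamError {
            position: 0,
            message: "expected digit".to_string(),
        }
        .into()), // wrap the plain error via `From`
    }
}

fn main() {
    assert_eq!(expect_digit("7x"), Ok(('7', "x")));
    let err = expect_digit("x").unwrap_err();
    assert_eq!(err.error.message, "expected digit");
}
```

Because the wrapper is built through `From`, code that returns errors never has to name the wrapper's internal fields, so those fields can change without breaking callers.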

### 2.0.0-beta3

* `parse_state` renamed to `parse_stream`.
2 changes: 1 addition & 1 deletion benches/http.rs
@@ -60,7 +60,7 @@ fn is_http_version(c: u8) -> bool {
c >= b'0' && c <= b'9' || c == b'.'
}

fn parse_http_request(input: &[u8]) -> Result<((Request, Vec<Header>), &[u8]), ParseError<&[u8]>> {
fn parse_http_request(input: &[u8]) -> Result<((Request, Vec<Header>), &[u8]), StreamError<&[u8]>> {
// Making a closure, because parser instances cannot be reused
let end_of_line = || (token(b'\r'), token(b'\n')).map(|_| b'\r').or(token(b'\n'));

39 changes: 21 additions & 18 deletions benches/json.rs
@@ -11,10 +11,11 @@ use std::path::Path;
use bencher::{black_box, Bencher};

use pc::primitives::{BufferedStream, Consumed, IteratorStream, ParseError, ParseResult, Parser,
State, Stream};
Stream};
use pc::char::{char, digit, spaces, string, Spaces};
use pc::combinator::{any, between, choice, many, optional, parser, satisfy, sep_by, Expected,
FnParser, Skip, many1};
use pc::char::{char, digit, spaces, string, Spaces};
use pc::state::{SourcePosition, State};

#[derive(PartialEq, Debug)]
enum Value {
@@ -119,10 +120,9 @@ where
});
match c {
'\\' => input.combine(|input| back_slash_char.parse_stream(input)),
'"' => Err(Consumed::Empty(ParseError::from_errors(
input.into_inner().position(),
Vec::new(),
))),
'"' => Err(Consumed::Empty(
ParseError::from_errors(input.into_inner().position(), Vec::new()).into(),
)),
_ => Ok((c, input)),
}
}
@@ -152,21 +152,21 @@ where
}
#[allow(unconditional_recursion)]
fn value_(input: I) -> ParseResult<Value, I> {
let mut array = between(
let array = between(
lex(char('[')),
lex(char(']')),
sep_by(Json::<I>::value(), lex(char(','))),
).map(Value::Array);

choice::<[&mut Parser<Input = I, Output = Value>; 7], _>([
&mut Json::<I>::string().map(Value::String),
&mut Json::<I>::object(),
&mut array,
&mut Json::<I>::number().map(Value::Number),
&mut lex(string("false").map(|_| Value::Bool(false))),
&mut lex(string("true").map(|_| Value::Bool(true))),
&mut lex(string("null").map(|_| Value::Null)),
]).parse_lazy(input)
choice((
Json::<I>::string().map(Value::String),
Json::<I>::object(),
array,
Json::<I>::number().map(Value::Number),
lex(string("false").map(|_| Value::Bool(false))),
lex(string("true").map(|_| Value::Bool(true))),
lex(string("null").map(|_| Value::Null)),
)).parse_lazy(input)
.into()
}
}
@@ -238,9 +238,12 @@ fn bench_buffered_json(bencher: &mut Bencher) {
.and_then(|mut file| file.read_to_string(&mut data))
.unwrap();
bencher.iter(|| {
let buffer = BufferedStream::new(IteratorStream::new(data.chars()), 1);
let buffer = BufferedStream::new(State::new(IteratorStream::new(data.chars())), 1);
let mut parser = Json::value();
match parser.parse(State::new(buffer.as_stream())) {
match parser.parse(State::with_positioner(
buffer.as_stream(),
SourcePosition::default(),
)) {
Ok((Value::Array(v), _)) => {
black_box(v);
}
4 changes: 2 additions & 2 deletions benches/mp4.rs
@@ -34,7 +34,7 @@ enum MP4Box<'a> {
Unknown,
}

fn parse_mp4(data: &[u8]) -> Result<(Vec<MP4Box>, &[u8]), ParseError<&[u8]>> {
fn parse_mp4(data: &[u8]) -> Result<(Vec<MP4Box>, &[u8]), StreamError<&[u8]>> {
let brand_name = || take(4).and_then(from_utf8);
let filetype_box = (
range(&b"ftyp"[..]),
@@ -67,7 +67,7 @@ fn parse_mp4(data: &[u8]) -> Result<(Vec<MP4Box>, &[u8]), ParseError<&[u8]>> {
fn run_test(b: &mut Bencher, data: &[u8]) {
b.iter(|| match parse_mp4(data) {
Ok(x) => black_box(x),
Err(err) => panic!("{:?}", err),
Err(err) => panic!("{}", err.map_range(|bytes| format!("{:?}", bytes))),
});
}

17 changes: 9 additions & 8 deletions src/byte.rs
@@ -5,8 +5,7 @@ use std::marker::PhantomData;
use self::ascii::AsciiChar;

use combinator::{satisfy, skip_many, token, tokens, Expected, Satisfy, SkipMany, Token, With};
use primitives::{ConsumedResult, Info, ParseError, Parser, RangeStream, Stream};
use range::take;
use primitives::{ConsumedResult, Info, Parser, Stream, StreamError, Tracked};

/// Parses a byte and succeeds if the byte is equal to `c`.
///
@@ -281,8 +280,8 @@ where
.parse_lazy(input)
.map(|bytes| bytes.as_slice())
}
fn add_error(&mut self, errors: &mut ParseError<Self::Input>) {
tokens(|&l, r| l == r, Info::Range(self.0), self.0.iter()).add_error(errors)
fn add_error(&mut self, errors: &mut Tracked<StreamError<Self::Input>>) {
tokens::<_, _, I>(|&l, r| l == r, Info::Range(self.0), self.0.iter()).add_error(errors)
}
}

@@ -330,9 +329,9 @@ where
let cmp = &mut self.1;
tokens(|&l, r| cmp(l, r), Info::Range(self.0), self.0).parse_lazy(input)
}
fn add_error(&mut self, errors: &mut ParseError<Self::Input>) {
fn add_error(&mut self, errors: &mut Tracked<StreamError<Self::Input>>) {
let cmp = &mut self.1;
tokens(|&l, r| cmp(l, r), Info::Range(self.0), self.0.iter()).add_error(errors)
tokens::<_, _, I>(|&l, r| cmp(l, r), Info::Range(self.0), self.0.iter()).add_error(errors)
}
}

@@ -368,6 +367,8 @@ where
/// Parsers for decoding numbers in big-endian or little-endian order.
pub mod num {
use super::*;
use primitives::RangeStream;
use range::take;

use byteorder::{ByteOrder, BE, LE};

@@ -392,8 +393,8 @@ pub mod num {
fn parse_lazy(&mut self, input: Self::Input) -> ConsumedResult<Self::Output, Self::Input> {
take(::std::mem::size_of::<Self::Output>()).map(B::$read_name).parse_lazy(input)
}
fn add_error(&mut self, errors: &mut ParseError<Self::Input>) {
take(::std::mem::size_of::<Self::Output>()).add_error(errors)
fn add_error(&mut self, errors: &mut Tracked<StreamError<Self::Input>>) {
take::<I>(::std::mem::size_of::<Self::Output>()).add_error(errors)
}
}

13 changes: 7 additions & 6 deletions src/char.rs
@@ -1,4 +1,4 @@
use primitives::{ConsumedResult, ParseError, Parser, Stream};
use primitives::{ConsumedResult, Parser, Stream, StreamError, Tracked};
use combinator::{satisfy, skip_many, token, tokens, Expected, Satisfy, SkipMany, Token, With};
use std::marker::PhantomData;

@@ -308,8 +308,8 @@ where
.parse_lazy(input)
.map(|_| self.0)
}
fn add_error(&mut self, errors: &mut ParseError<Self::Input>) {
tokens(eq, self.0.into(), self.0.chars()).add_error(errors)
fn add_error(&mut self, errors: &mut Tracked<StreamError<Self::Input>>) {
tokens::<_, _, I>(eq, self.0.into(), self.0.chars()).add_error(errors)
}
}

@@ -351,8 +351,8 @@ where
.parse_lazy(input)
.map(|_| self.0)
}
fn add_error(&mut self, errors: &mut ParseError<Self::Input>) {
tokens(&mut self.1, self.0.into(), self.0.chars()).add_error(errors)
fn add_error(&mut self, errors: &mut Tracked<StreamError<Self::Input>>) {
tokens::<_, _, I>(&mut self.1, self.0.into(), self.0.chars()).add_error(errors)
}
}

@@ -382,7 +382,8 @@ where
#[cfg(test)]
mod tests {
use super::*;
use primitives::{Error, ParseError, Parser, SourcePosition, State};
use primitives::{Error, ParseError, Parser};
use state::{SourcePosition, State};

#[test]
fn space_error() {