Parsing command line arguments

A typical invocation of our CLI tool will look like this:

$ grrs foobar test.txt

We expect our program to look at test.txt and print out the lines that contain foobar. But how do we get these two values?

The text after the name of the program is often called the "command line arguments", or "command line flags" (especially when they look like --this). Internally, the operating system usually represents them as a list of strings – roughly speaking, they get separated by spaces.

There are many ways to think about these arguments and how to parse them into something easier to work with. You will also need to tell the users of your program which arguments they need to give and in which format they are expected.

Getting the arguments

The standard library contains the function std::env::args() that gives you an iterator of the given arguments. The first entry (at index 0) will be the name your program was called as (e.g. grrs), the ones that follow are what the user wrote afterwards.

Getting the raw arguments this way is quite easy (in file src/main.rs, after fn main() {):

{{#include cli-args-struct.rs:10:11}}
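To see what the iterator actually yields, a minimal sketch that numbers every entry may help. The helper name `numbered` is made up for this illustration; only `std::env::args()` comes from the standard library.

```rust
use std::env;

/// Pairs each argument with its position in the list.
/// (`numbered` is a hypothetical helper, not part of the tutorial code.)
fn numbered(args: impl Iterator<Item = String>) -> Vec<String> {
    args.enumerate()
        .map(|(index, arg)| format!("{}: {}", index, arg))
        .collect()
}

fn main() {
    // Index 0 is the name the program was called as; the rest is user input.
    for line in numbered(env::args()) {
        println!("{}", line);
    }
}
```

Running this with a few arguments shows that the entries arrive as plain strings, in the order they were typed, with no notion yet of which one is the pattern and which one is the path.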

CLI arguments as data type

Instead of thinking about them as a bunch of text, it often pays off to think of CLI arguments as a custom data type that represents the inputs to your program.

Look at grrs foobar test.txt: There are two arguments, first the pattern (the string to look for), and then the path (the file to look in).

What more can we say about them? Well, for a start, both are required. We haven't talked about any default values, so we expect our users to always provide two values. Furthermore, we can say a bit about their types: The pattern is expected to be a string, while the second argument is expected to be a path to a file.

In Rust, it is common to structure programs around the data they handle, so this way of looking at CLI arguments fits very well. Let's start with this (in file src/main.rs, before fn main() {):

{{#include cli-args-struct.rs:3:7}}

This defines a new structure (a struct) that has two fields to store data in: pattern, and path.

Aside: PathBuf is like a String but for file system paths that work cross-platform.
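As a rough illustration of what that buys you over a plain String, here is a small sketch using only the standard library. The helper `in_dir` and the directory name `logs` are invented for this example.

```rust
use std::path::PathBuf;

/// Builds `dir/name` using the separator of the current platform.
/// (`in_dir` is a hypothetical helper for this illustration.)
fn in_dir(dir: &str, name: &str) -> PathBuf {
    let mut path = PathBuf::from(dir);
    path.push(name);
    path
}

fn main() {
    let path = in_dir("logs", "test.txt");

    // Path-specific questions are answered without manual string slicing.
    let file_name = path.file_name().and_then(|n| n.to_str());
    let extension = path.extension().and_then(|e| e.to_str());

    println!("path: {}", path.display());
    println!("file name: {:?}, extension: {:?}", file_name, extension);
}
```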

Now, we still need to get the actual arguments our program got into this form. One option would be to manually parse the list of strings we get from the operating system and build the structure ourselves. It would look something like this:

{{#include cli-args-struct.rs:10:15}}

This works, but it's not very convenient. How would you deal with the requirement to support --pattern="foo" or --pattern "foo"? How would you implement --help?
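To get a feeling for the effort involved, here is a hedged sketch of handling just those two spellings of one flag by hand. The function name `find_pattern` is hypothetical, and the sketch assumes the shell has already removed the quotes, so the program sees `--pattern=foo` rather than `--pattern="foo"`.

```rust
/// Looks for `--pattern=VALUE` or `--pattern VALUE` in the raw arguments.
/// (`find_pattern` is a hypothetical helper written for this illustration.)
fn find_pattern(args: &[String]) -> Option<String> {
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if let Some(value) = arg.strip_prefix("--pattern=") {
            // Form 1: flag and value joined by `=`.
            return Some(value.to_string());
        }
        if arg == "--pattern" {
            // Form 2: the value is the next argument, if there is one.
            return iter.next().cloned();
        }
    }
    None
}

fn main() {
    // Skip index 0, the program name.
    let args: Vec<String> = std::env::args().skip(1).collect();
    match find_pattern(&args) {
        Some(pattern) => println!("pattern: {}", pattern),
        None => eprintln!("no --pattern given"),
    }
}
```

Even this covers only one flag and ignores short forms, repeated flags, help output, and error messages, which is the kind of work a library can take over.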

Parsing CLI arguments with StructOpt

A much nicer way is to use one of the many available libraries. The most popular library for parsing command line arguments is called clap. It has all the functionality you'd expect, including support for sub-commands, shell completions, and great help messages.

The structopt library builds on clap and provides a "derive" macro to generate clap code for struct definitions. This is quite nice: All we have to do is annotate a struct and it’ll generate the code that parses the arguments into the fields.

Let's first import structopt by adding structopt = "0.3.13" to the [dependencies] section of our Cargo.toml file.

Now, we can write use structopt::StructOpt; in our code, and add #[derive(StructOpt)] right above our struct Cli. Let's also write some documentation comments along the way.

It’ll look like this (in file src/main.rs, before fn main() {):

{{#include cli-args-structopt.rs:3:14}}

Note: There are a lot of custom attributes you can add to fields. For example, we added one to tell structopt how to parse the PathBuf type. To say you want to use this field for the argument after -o or --output, you'd add #[structopt(short = "o", long = "output")]. For more information, see the structopt documentation.

Right below the Cli struct our template contains its main function. When the program starts, it will call this function. The first line is:

{{#include cli-args-structopt.rs:15:18}}

This will try to parse the arguments into our Cli struct.

But what if that fails? That's the beauty of this approach: Clap knows which fields to expect, and what their expected format is. It can automatically generate a nice --help message, as well as give some great errors to suggest you pass --output when you wrote --putput.

Note: The from_args method is meant to be used in your main function. When it fails, it will print out an error or help message and immediately exit the program. Don't use it in other places!

This is what it may look like

Running it without any arguments:

$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 10.16s
     Running `target/debug/grrs`
error: The following required arguments were not provided:
    <pattern>
    <path>

USAGE:
    grrs <pattern> <path>

For more information try --help

We can pass arguments when using cargo run directly by writing them after --:

$ cargo run -- some-pattern some-file
    Finished dev [unoptimized + debuginfo] target(s) in 0.11s
     Running `target/debug/grrs some-pattern some-file`

As you can see, there is no output. Which is good: That just means there is no error and our program ended.

Exercise for the reader: Make this program output its arguments!

Parsing CLI arguments with 'Clap v3 beta' and dealing with stdin and stdout

Similar to StructOpt, it is also possible with Clap v3 (currently in beta) to define the command-line options as a struct and derive the rest.

The following code also illustrates how to read input either from a file, if one is provided as a command-line argument, or from stdin otherwise. The same applies to output: if no output file is given as a command-line argument, the output is written to stdout by default.

{{#include cli-args-clap3-stdin-stdout.rs}}
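Independent of the argument parser, the choice between a file and a standard stream can be sketched with the standard library alone. This is not the included program, just one possible shape of the idea: the helpers `open_input` and `open_output` are hypothetical, and the `None` values in `main` stand in for whatever the parsed arguments would provide.

```rust
use std::fs::File;
use std::io::{self, Read, Write};
use std::path::Path;

/// Opens the given file for reading, or falls back to stdin.
fn open_input(path: Option<&Path>) -> io::Result<Box<dyn Read>> {
    let reader: Box<dyn Read> = match path {
        Some(p) => Box::new(File::open(p)?),
        None => Box::new(io::stdin()),
    };
    Ok(reader)
}

/// Creates the given file for writing, or falls back to stdout.
fn open_output(path: Option<&Path>) -> io::Result<Box<dyn Write>> {
    let writer: Box<dyn Write> = match path {
        Some(p) => Box::new(File::create(p)?),
        None => Box::new(io::stdout()),
    };
    Ok(writer)
}

fn main() -> io::Result<()> {
    // Assumption: both paths would come from the parsed arguments;
    // `None` here selects stdin and stdout.
    let mut input = open_input(None)?;
    let mut output = open_output(None)?;
    // Copy everything from the input to the output unchanged.
    io::copy(&mut input, &mut output)?;
    Ok(())
}
```

Returning boxed trait objects keeps the rest of the program unaware of where the data comes from or goes to, at the cost of dynamic dispatch on each read and write.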

Now the program can be used in four different ways:

cargo run --bin cli-args-clap3-stdin-stdout -- < input.csv > output.csv
cargo run --bin cli-args-clap3-stdin-stdout -- input.csv > output.csv
cargo run --bin cli-args-clap3-stdin-stdout -- --output output.csv < input.csv
cargo run --bin cli-args-clap3-stdin-stdout -- --output output.csv input.csv

The help message looks like this:

cargo run --bin cli-args-clap3-stdin-stdout -- --help
CLAiR 0.1
A sample-program

The program is a show-case how to read data either from file or stdin and write data either to file
or stdout

USAGE:
    cli-args-clap3-stdin-stdout [OPTIONS] [input]

ARGS:
    <input>
            Optional input file; default stdin

FLAGS:
    -h, --help
            Prints help information

    -V, --version
            Prints version information


OPTIONS:
    -o, --output <output>
            Optional output file; default stdout