Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Research spike for creating a new "dict" type #3158

Closed
rockstar opened this issue Aug 28, 2020 · 8 comments
Closed

Research spike for creating a new "dict" type #3158

rockstar opened this issue Aug 28, 2020 · 8 comments
Assignees

Comments

@rockstar
Copy link
Contributor

rockstar commented Aug 28, 2020


fromList({key: ) => dict
get(dict, key) => value
insert(dict, key, value) => dict
delete(dict, key) => dict
  • Arrays are an example of a parameterized type, can be used as an example.
  • dicts are immutable, so mutations will return a new dict
    • Performant algorithms for how this would be implemented on the go side should be researched. How do other languages implement immutable key/value pairs?
    • Does insert and delete get implemented as a first pass? There must be a path forward planned, even if not fully implemented.
  • Syntax
    • What does the syntax look like for the type of a dict?
    • What does the syntax look like to create a dict? Do we want syntax at all, or do we just use create? Could the creation syntactic sugar look like this: [a: b, c: d]
    • What does the syntax look like to access keys? Do we want syntax?
    • Syntactic sugar isn't strictly required for a first pass, though there is a "fun factor" to the sugar there.
  • Do we have createFromList and createFromRecord?
    • createFromRecord would require some reflection, so probably not.

Nice to haves, but not required:

  • Can we find a way to disambiguate between list, record, and dict in something like a = 1 obj[a]? What is the impact on the type system, and is it even possible?

DOD

  • Update SPEC for new specification of the Flux dict type
  • Set of issues added to epic to implement the specification
@rockstar rockstar added this to the Sprint 20-Q3-6 milestone Aug 28, 2020
@nathanielc nathanielc removed this from the Sprint 20-Q4-1 milestone Oct 5, 2020
@jpacik jpacik self-assigned this Nov 2, 2020
@jpacik
Copy link
Contributor

jpacik commented Nov 6, 2020

Basic API

The basic API of the dictionary type is the following

builtin fromList : (pairs: [{ key: K, value: V }]) => |K: V| where K: Comparable
builtin get : (key: K, from: |K: V|, default: V) => V where K: Comparable
builtin insert : (key: K, value: V, dict: |K: V|) => |K: V| where K: Comparable
builtin remove : (key: K, dict: |K: V|) => |K: V| where K: Comparable

At the very least we need a function that instantiates a dictionary from a list of pairs. We need a way to lookup an element from a dict. And we also need a way to modify dictionaries by inserting and removing elements.

Note this basic API does not require any new syntax. This means no changes to the parser/AST. We do however need to define a new parameterized type for dictionaries |K: V|. Again because the basic API doesn't require any syntax changes to the language, this means type inference doesn't need to generate any special type constraints. All we need is to define the unification rules for the new dict type. There's only one rule in this case

|k0: v0| = |k1: v1| => k0 = v0, k1 = v1

This means the type system will catch errors such as the following

d = dict.fromList(pairs: [{key: [], value: []}]) // error: array types not comparable
d = dict.fromList(pairs: [{key: "a", value: 0}])
dict.get(key: "a", from: d, default: -1)
dict.get(key: 1, from: d, default: -1) // error: string != int

Implementation

Immutable maps are typically implemented using some form of balanced/unbalanced tree data structure. This keeps updates/inserts/deletes from having to do a full copy of the tree. Instead these mutations only require copying the search path. The rest of the tree can be shared.

Ben Johnson has actually implemented a sorted immutable map type for which the underlying representation is a B+Tree https://github.com/benbjohnson/immutable#sorted-map. It's not clear just yet if this is the right implementation to use as the internal node size is 32 elements which means the size of a dictionary will need to be relatively large in order for operations to be efficient. In particular, modifying dictionaries fewer than 32 elements in size will involve a full deep copy. If this is too much overhead then we'll probably have to implement a persistent map ourselves.

Regardless of the underlying implementation however, we'll need to create a new value type similar to the following

type Dictionary interface {
    Get(key int64, default: Value) Value
    Insert(key int64, val: Value) Dictionary
    Remove(key int64) Dictionary

    Get(key string, default: Value) Value
    Insert(key string, val: Value) Dictionary
    Remove(key string) Dictionary

    ...
}

Implementing dict.get will then look like

func (ctx context.Context, args values.Object) (values.Value, error) {
	key, ok := args.Get("key")
	if !ok {
                ...
	}
	from, ok := args.Get("dict")
	if !ok {
		...
	}
        def, ok := args.Get("default")
        if !ok {
                ...
        }
	switch k.Type().Nature() {
	case Int:
		return d.Get(k.Int(), def), nil
	case String:
		return d.Get(k.String(), def), nil
        ...
	}
}

Syntax for dictionary literals

dictionary = | "a": 0, "b": 1, "c": 2 |

dict.fromList works perfectly fine for instantiating a dictionary but it's a little clunky. It would be nice to have simple syntax for dictionary literals. This would require updates to the parser, type checker, and interpreter.

Type inference will need to generate type constraints for this new syntax. For example, for the following literal

| "a": 0, "b": 1 |

we need to generate the following constraints

type_of(| "a": 0, "b": 1 |) = |k0: v0| where k0: Comparable
k0 = type_of("a")
k0 = type_of("b")
v0 = type_of(0)
v1 = type_of(1)

There is however one edge case we need to handle and that is the empty dictionary literal []. First the parser thinks this is an array literal and as a result so does type inference. This means the following flux will fail type checking

dict.insert(key: "a", value: 1, dict: [])

In order to handle this case we need to define a new type constraint Indexable where array types and dict types are both Indexable. Now instead of [] being constrained as an array type, it will be constrained as Indexable initially and in the example above, the concrete dict type will be inferred when type checking the function call.

We don't need to add any special type constraints since there's no ambiguity syntax-wise among dictionaries, records, or arrays.

Syntax for accessing dictionaries

I really like the syntax dict["a"] for reading dictionaries. However that syntax is actually equivalent to dict.a. That is we see dict as a record type with label a. Unfortunately we can't overcome this ambiguity at the moment. So while using dict.get is definitely more verbose, I'd rather use it instead of adding new syntax to the language which will most likely be unfamiliar to most users.

@nathanielc
Copy link
Contributor

Two questions:

On the sytnax for the literals would we support an identifier as the key? i.e. something like this [a: 1, b: 2]?

For consistency should we call all the dictionary args dict? The only one that is not is the get funciton:

builtin get(key: K, from: [K: V], default: V) -> V where K: Comparable

Should it be instead?

builtin get(key: K, dict: [K: V], default: V) -> V where K: Comparable

I like how dict.get(key:"a", from: d, default: 1) reads but it is the only one that is different which makes me think it will be hard to remember as a user.

@jpacik
Copy link
Contributor

jpacik commented Nov 10, 2020

@nathanielc

On the sytnax for the literals would we support an identifier as the key? i.e. something like this [a: 1, b: 2]?

I don't think so since it's not consistent with the type.

For consistency should we call all the dictionary args dict?

Yes I think you're right.

@nathanielc
Copy link
Contributor

Makes sense follow up question:

On the sytnax for the literals would we support an identifier as the key? i.e. something like this [a: 1, b: 2]?

I don't think so since it's not consistent with the type.

Would we let literals have a dynamic key?

a = "foo"
d = [a: 1, "baz": 2]
d == ["foo": 1, "baz": 2] // true

@jpacik
Copy link
Contributor

jpacik commented Nov 10, 2020

Ah it seems I misunderstood your original question. Yes any expression that evaluates to (in this case) a string should be valid to use as a key value in a dict literal.

@rockstar
Copy link
Contributor Author

This looks pretty straightforward, though it's clearly a bigger bite than I originally expected this to have. I was reading it thinking "could we create a kind of 'limited' dict? One that only has strings as keys?" Looking back at my notes, I saw that we did discuss this, and I just forgot.

Would we let literals have a dynamic key?

This is the place where I feel kind of awkward. I think this is the right thing, but it could create some confusion with people working in other languages (javascript, for instance, has a real and actual adjacent problem). I guess the only thing I'm asking is that we make sure the missing identifier error here be helpful in explaining what happened here when one gets it wrong, but even saying that out loud, we could do to have friendlier error messages in a number of places.

@nathanielc
Copy link
Contributor

@rockstar That is a great point, we want to have a super clear error message for a missing identifier in this context that recommends the user use quotes.

For example something like this (using the new syntax proposed in the PR #3317)

hostLookup = |
    d35qd9: "host.example.com",
   //...
|
type error @... identifier "d35qd9" is not defined, use quotes for dictionary keys, for example | "d35qd9": "host.example.com" |

Or something like that. On a related note I found this a while back that is a Rust crate for providing helpful error messages with suggestion like this https://github.com/brendanzab/codespan

@rockstar
Copy link
Contributor Author

a Rust crate for providing helpful error messages with suggestion like this

OF COURSE there's a rust crate for nice error messages. 😂 I don't think I'd ever thought much about error messages as much as I have since writing rust regularly.

@jpacik jpacik closed this as completed Nov 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants