
created_at: 2016-02-08T06:53:40.000Z
title: What's new in purely functional data structures since Okasaki? (2010)
url: http://cstheory.stackexchange.com/questions/1539/whats-new-in-purely-functional-data-structures-since-okasaki
author: r4um
points: 154
num_comments: 42
created_at_i: 1454914420
_tags: story, author_r4um, story_11056704
objectID: 11056704

Source

reference request - What's new in purely functional data structures since Okasaki? - Theoretical Computer Science Stack Exchange

Stack Exchange Network

Stack Exchange network consists of 172 Q&A communities including Stack Overflow, the largest, most trusted online community for developers to learn, share their knowledge, and build their careers.

Visit Stack Exchange

current community

[

Theoretical Computer Science ]4

  • [

Theoretical Computer Science Meta ]7

your communities

Sign up or log in to customize your list.

more stack exchange communities

company blog

  1. 15
  2. 16
  3. 17
  4. 18

Log In Sign Up

Theoretical Computer Science

_

Theoretical Computer Science Stack Exchange is a question and answer site for theoretical computer scientists and researchers in related fields. Join them; it only takes a minute:

Sign up

Here's how it works:

  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

What's new in purely functional data structures since Okasaki?


Since Chris Okasaki's 1998 book "Purely functional data structures", I haven't seen too many new exciting purely functional data structures appear; I can name just a few:

  • IntMap (also invented by Okasaki in 1998, but not present in that book)
  • Finger trees (and their generalization over monoids)

There are also some interesting ways of implementing already known data structures, such as using "nested types" or "generalized algebraic datatypes" to ensure tree invariants.
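
For example, a minimal Haskell sketch of both ideas (all names here are illustrative): a nested (non-regular) datatype whose shape only admits perfectly balanced trees, and a GADT indexed by a type-level depth that enforces the same invariant.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}

-- Nested (non-regular) datatype: each Fork level pairs up elements,
-- so only perfectly balanced shapes (sizes 2^k) are representable.
data Nested a = NLeaf a | NFork (Nested (a, a))

-- GADT indexed by depth: both children of a node must carry the same
-- depth index, so an unbalanced tree is a type error.
data Nat = Z | S Nat

data Perfect (d :: Nat) a where
  Leaf :: a -> Perfect 'Z a
  Node :: Perfect d a -> Perfect d a -> Perfect ('S d) a
```

Red-black trees whose color and black-height invariants live in the index are a standard larger example of the same trick.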

Which other new ideas have appeared since 1998 in this area?

reference-request big-list ds.data-structures functional-programming

asked Sep 21 '10 at 8:04 by jkff

  • Nice question. I just had a student asking me about this, and didn't know the answer. – Suresh Venkat♦ Sep 21 '10 at 8:07
  • This is OK for here, but you might get better answers on Stack Overflow. If you ask there, be sure to link to the discussion here. – Charles Stewart Sep 21 '10 at 13:01
  • Well, the Haskell Reddit has seen this, so there will be some good answers coming in from there too, but excellent question. Being just halfway through Okasaki's book, I was wondering the same thing myself. +1 – Robert Massaioli Sep 21 '10 at 22:52
  • This inspired me to ask a related question. – jbapple Sep 22 '10 at 18:44
  • Okasaki's blog post from 2008 for the 10th anniversary of the book. – Kaveh Jan 4 '16 at 12:33

6 Answers

Accepted answer (528 votes):

New purely functional data structures published since 1998:

Known in 1997, but not discussed in Okasaki's book:

  • Many other styles of balanced search tree. AVL, brother, rank-balanced, bounded-balance, and many other balanced search trees can be (and have been) implemented purely functionally by path copying. Perhaps deserving special mention are:

    • **Biased Search Trees, by Samuel W. Bent, Daniel D. Sleator, and Robert E. Tarjan**: A key element in Brodal et al.'s 2006 paper and Demaine et al.'s 2008 paper.
  • **Infinite sets that admit fast exhaustive search, by Martín Escardó**: Perhaps not a data structure per se.

  • **Three algorithms on Braun Trees, by Chris Okasaki**: Braun trees offer many stack operations in worst-case O(lg n). This bound is surpassed by many other data structures, but Braun trees have a cons operation lazy in its second argument, and so can be used as infinite stacks in some ways that other structures cannot.

  • **The relaxed min-max heap: A mergeable double-ended priority queue and The KD heap: An efficient multi-dimensional priority queue, by Yuzheng Ding and Mark Allen Weiss**: These happen to be purely functional, though this is not discussed in the papers. I do not think the time bounds achieved are any better than those that can be achieved by using finger trees (of Hinze & Paterson or Kaplan & Tarjan) as k-dimensional priority queues, but I think the structures of Ding & Weiss use less space.

  • **The Zipper, by Gérard Huet**: Used in many other data structures (such as Hinze & Paterson's finger trees), this is a way of turning a data structure inside-out.

  • Difference lists are O(1) catenable lists with an O(n) transformation to usual cons lists. They have apparently been known since antiquity in the Prolog community, where they also have an O(1) transformation to usual cons lists. That O(1) transformation seems to be impossible in traditional functional programming, but Minamide's hole abstraction, from POPL '98, discusses a way of allowing O(1) append and O(1) transformation within pure functional programming. Unlike the usual functional programming implementations of difference lists, which are based on function closures (a closure-based sketch follows this list), hole abstractions are essentially the same, in both their use and their implementation, as Prolog difference lists. However, it seems that for years the only person who noticed this was one of Minamide's reviewers.

  • Uniquely represented dictionaries support insert, update, and lookup with the restriction that no two structures holding the same elements can have distinct shapes. To give an example, sorted singly-linked lists are uniquely represented, but traditional AVL trees are not. Tries are also uniquely represented. Tarjan and Sundar, in "Unique binary search tree representations and equality-testing of sets and sequences", showed a purely functional uniquely represented dictionary that supports searches in logarithmic time and updates in O(√n) time. However, it uses Θ(n lg n) space. There is a simple representation using Braun trees that uses only linear space but has update time of Θ(√(n lg n)) and search time of Θ(lg² n).
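
The closure-based representation mentioned in the difference-list item above is tiny in Haskell; a minimal sketch (names illustrative):

```haskell
-- A difference list is a function that prepends its elements to whatever
-- list it is eventually applied to.
newtype DList a = DList ([a] -> [a])

fromList :: [a] -> DList a
fromList xs = DList (xs ++)

-- O(1) append: compose the two prepending functions.
append :: DList a -> DList a -> DList a
append (DList f) (DList g) = DList (f . g)

-- O(n) conversion back to an ordinary cons list.
toList :: DList a -> [a]
toList (DList f) = f []
```

Hole abstraction, roughly speaking, replaces the closure with a partially built cons list containing an explicit hole, which is how the Prolog version also gets its O(1) conversion.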

Mostly functional data structures, before, during, and after Okasaki's book:

  • Many procedures for making data structures persistent, fully persistent, or confluently persistent: Haim Kaplan wrote an excellent survey on the topic. See also above the work of Demaine et al., who demonstrate a fully persistent array in O(m) space (where m is the number of operations ever performed on the array) and O(lg lg n) expected access time.
  • 1989: **Randomized Search Trees by Cecilia R. Aragon and Raimund Seidel**: These were discussed in a purely functional setting by Guy E. Blelloch and Margaret Reid-Miller in Fast Set Operations Using Treaps and by Dan Blandford and Guy Blelloch in Functional Set Operations with Treaps (code). They provide all of the operations of purely functional finger trees and biased search trees, but require a source of randomness, making them not purely functional (a minimal insertion sketch follows this list). This may also invalidate the time complexity of the operations on treaps, assuming an adversary who can time operations and repeat the long ones. (This is the same reason why imperative amortization arguments aren't valid in a persistent setting, but it requires an adversary with a stopwatch.)
  • 1997: **Skip-trees, an alternative data structure to Skip-lists in a concurrent approach, by Xavier Messeguer and Exploring the Duality Between Skip Lists and Binary Search Trees, by Brian C. Dean and Zachary H. Jones**: Skip lists are not purely functional, but they can be implemented functionally as trees. Like treaps, they require a source of random bits. (It is possible to make skip lists deterministic, but, after translating them to a tree, I think they are just another way of looking at 2-3 trees.)
  • 1998: All of the amortized structures in Okasaki's book! Okasaki invented this new method for mixing amortization and functional data structures, which were previously thought to be incompatible. It depends upon memoization, which, as Kaplan and Tarjan have sometimes mentioned, is actually a side effect. In some cases (such as PFDS on SSDs for performance reasons), this may be inappropriate.
  • 1998: **Simple Confluently Persistent Catenable Lists, by Haim Kaplan, Chris Okasaki, and Robert E. Tarjan**: Uses modification under the hood to give amortized O(1) catenable deques, presenting the same interface as an earlier (purely functional, but with memoization) version appearing in Okasaki's book. Kaplan and Tarjan had earlier created a purely functional O(1) worst-case structure, but it is substantially more complicated.
  • 2007: As mentioned in another answer on this page, semi-persistent data structures and persistent union-find by Sylvain Conchon and Jean-Christophe Filliâtre
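
A minimal insertion sketch for the treap item above (names illustrative; the priority is passed in explicitly, which is exactly where the source of randomness enters):

```haskell
-- A treap node stores a key and a priority: a binary search tree on keys
-- and a max-heap on priorities.
data Treap k = Tip | Node (Treap k) k Int (Treap k)

insert :: Ord k => k -> Int -> Treap k -> Treap k
insert k p Tip = Node Tip k p Tip
insert k p t@(Node l k' p' r)
  | k < k'    = rotR (Node (insert k p l) k' p' r)
  | k > k'    = rotL (Node l k' p' (insert k p r))
  | otherwise = t
  where
    -- rotate only when the freshly inserted child violates the heap order
    rotR (Node (Node a x px b) y py c)
      | px > py = Node a x px (Node b y py c)
    rotR n = n
    rotL (Node a y py (Node b x px c))
      | px > py = Node (Node a y py b) x px c
    rotL n = n
```

In use, the priority would be drawn from a pseudorandom generator that has to be threaded through (or hidden in a monad), which is the impurity the item above points out.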

Imperative data structures or analyses not discussed in Okasaki's book, but related to purely functional data structures:

  • The Soft Heap: An Approximate Priority Queue with Optimal Error Rate, by Bernard Chazelle: This data structure does not use arrays, and so has tempted first the #haskell IRC channel and later Stack Overflow users, but it includes delete in o(lg n), which is usually not possible in a functional setting, and imperative amortized analysis, which is not valid in a purely functional setting.
  • Balanced binary search trees with O(1) finger updates. In Making Data Structures Persistent, James R. Driscoll, Neil Sarnak, Daniel D. Sleator, and Robert E. Tarjan present a method for grouping the nodes in a red-black tree so that persistent updates require only O(1) space. The purely functional deques and finger trees designed by Tarjan, Kaplan, and Mihaescu all use a very similar grouping technique to allow O(1) updates at both ends. AVL-trees for localized search by Athanasios K. Tsakalidis works similarly.
  • Faster pairing heaps or better bounds for pairing heaps: Since Okasaki's book was published, several new analyses of imperative pairing heaps have appeared, including Pairing heaps with O(log log n) decrease cost by Amr Elmasry and Towards a Final Analysis of Pairing Heaps by Seth Pettie. It may be possible to apply some of this work to Okasaki's lazy pairing heaps. (A plain pairing-heap sketch follows this list.)
  • Deterministic biased finger trees: In Biased Skip Lists, by Amitabha Bagchi, Adam L. Buchsbaum, and Michael T. Goodrich, a design is presented for deterministic biased skip lists. Through the skip list/tree transformation mentioned above, it may be possible to make deterministic biased search trees. The finger biased skip lists described by John Iacono and Özgür Özkan in Mergeable Dictionaries might then be possible on biased skip trees. A biased finger tree is suggested by Demaine et al. in their paper on purely functional tries (see above) as a way to reduce the time and space bounds on finger update in tries.
  • The String B-Tree: A New Data Structure for String Search in External Memory and its Applications by Paolo Ferragina and Roberto Grossi is a well-studied data structure combining the benefits of tries and B-trees.
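
For the pairing-heap item above, a minimal sketch of the plain, strict pairing heap that these analyses concern (names illustrative):

```haskell
-- A pairing heap: a root element plus a list of sub-heaps.
data Heap a = Empty | Heap a [Heap a]

-- O(1) merge: the smaller root adopts the other heap as a new child.
merge :: Ord a => Heap a -> Heap a -> Heap a
merge Empty h = h
merge h Empty = h
merge h1@(Heap x hs1) h2@(Heap y hs2)
  | x <= y    = Heap x (h2 : hs1)
  | otherwise = Heap y (h1 : hs2)

insert :: Ord a => a -> Heap a -> Heap a
insert x = merge (Heap x [])

findMin :: Heap a -> Maybe a
findMin Empty      = Nothing
findMin (Heap x _) = Just x

-- deleteMin does the "pairing" pass over the children; this is where the
-- interesting amortized bounds come from.
deleteMin :: Ord a => Heap a -> Heap a
deleteMin Empty       = Empty
deleteMin (Heap _ hs) = mergePairs hs
  where
    mergePairs (a : b : rest) = merge (merge a b) (mergePairs rest)
    mergePairs [a]            = a
    mergePairs []             = Empty
```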

community wiki answer by jbapple (41 revs, 5 users); last edited May 23 '17 at 11:33

  • I don't remember checking the "community wiki" box on this answer. Is there any way to undo that? – jbapple Sep 22 '10 at 20:10
  • @jbapple: after a certain number of edits, all posts become community wiki. That's an impressively thorough review there. Thank you. – Novelocrat Sep 23 '10 at 3:07
  • Great list! Which makes me wish Okasaki would publish a second edition. – Radu GRIGore Sep 30 '10 at 10:33
  • Note that Isabelle/HOL can generate code for SML, OCaml, Haskell, and Scala. The Haskabelle tool can also import Haskell into Isabelle/HOL. – Makarius Mar 4 '13 at 11:31
  • The terminology of "program extraction" comes from Coq: you take a constructive proof and make an executable program from it, stripping away some things. In Isabelle this is called "code generation" and works differently, using the HOL specifications as pseudo-code, not the proofs. Proof extraction in Isabelle/HOL according to Berghofer does work like Coq, but is rarely used these days. – Makarius Mar 4 '13 at 11:34

Answer (59 votes):

To the excellent notes already made, I'll add Zippers.

Huet, Gérard. "Functional Pearl: The Zipper." Journal of Functional Programming 7(5): 549–554, September 1997.

Wikipedia: Zipper (data structure)
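
A minimal sketch of the idea for binary trees (names illustrative): the focus is a subtree paired with the path of contexts back to the root, so moving the focus and editing at it are O(1).

```haskell
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- A context records everything around the focus: either we are at the
-- root, or we came down from a node whose label and other child we keep.
data Ctx a
  = Top
  | InL a (Tree a) (Ctx a)   -- we descended into the left child
  | InR (Tree a) a (Ctx a)   -- we descended into the right child

type Zipper a = (Tree a, Ctx a)

downLeft, downRight, up :: Zipper a -> Maybe (Zipper a)
downLeft  (Node l x r, c) = Just (l, InL x r c)
downLeft  _               = Nothing
downRight (Node l x r, c) = Just (r, InR l x c)
downRight _               = Nothing
up (t, InL x r c) = Just (Node t x r, c)
up (t, InR l x c) = Just (Node l x t, c)
up _              = Nothing

-- Replace the focused subtree in O(1); the rest of the tree is shared.
put :: Tree a -> Zipper a -> Zipper a
put t (_, c) = (t, c)
```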

answered Sep 21 '10 at 18:06 by Matt Might; edited Nov 13 '13 at 3:11 by Rory O'Kane

  • Zippers are AWESOME. For many use cases, they allow tree-based representations to become the "right" choice for many kinds of data where otherwise it'd be a bit more complicated. – Carter Tazio Schonwald Dec 1 '10 at 22:52
  • An example of their use for XML manipulation: anti-xml.org/zippers.html – Mechanical snail Aug 22 '12 at 20:29

Answer (36 votes):

Conchon and Filliâtre, "A Persistent Union-Find Data Structure" and "Semi-Persistent Data Structures."
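
The Conchon–Filliâtre structure reroots a mutable array behind a persistent front end; as a sketch of the interface only (names illustrative, with none of the paper's rerooting or efficiency), a naive persistent union-find over an immutable IntMap:

```haskell
import qualified Data.IntMap.Strict as IM

-- Each version of the structure is just an immutable parent map; roots
-- are the elements with no entry.  No path compression or union by rank,
-- so find is O(n) in the worst case -- this only illustrates the API.
newtype UF = UF (IM.IntMap Int)

empty :: UF
empty = UF IM.empty

find :: UF -> Int -> Int
find uf@(UF m) x = maybe x (find uf) (IM.lookup x m)

-- union returns a new version; the old version remains usable.
union :: Int -> Int -> UF -> UF
union x y uf@(UF m)
  | rx == ry  = uf
  | otherwise = UF (IM.insert rx ry m)
  where rx = find uf x
        ry = find uf y
```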

answered Sep 21 '10 at 8:13 by Radu GRIGore; edited Apr 26 '14 at 17:07 by Huge

  • Wow, a persistent UNION-FIND! Thanks! – jkff Sep 21 '10 at 8:17
  • Well, kind of... See the article. – Radu GRIGore Sep 21 '10 at 8:18
  • ... or, if you prefer, see some code (by Matt Parkinson): github.com/septract/jstar/blob/master/src/utils/… – Radu GRIGore Sep 21 '10 at 8:50
  • Now I see why the "kind of..." comment had an upvote. They have good performance only when one almost exclusively either does not use persistence or backtracks all the time: if you often use both "new" and "old" versions, you're screwed. Cool rerooting idea, though. – jkff Sep 22 '10 at 11:13

Answer (18 votes):

I'd add McBride's version of zippers as derivatives of data types.
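
A minimal sketch of the correspondence (names illustrative): one-hole contexts obey the rules of differentiation, and for lists the derivative gives exactly the familiar list-zipper context.

```haskell
-- For the pair functor F x = x * x, the derivative d/dx (x^2) = 2x:
-- a one-hole context for a pair is the other component, tagged with
-- which side the hole is on.
data PairCtx a = HoleLeft a | HoleRight a

-- For lists, L x = 1 + x * L x.  Differentiating and solving gives
-- dL/dx = L x * L x: the elements before the hole (stored nearest-first)
-- and the elements after it -- precisely the list zipper's context.
data ListCtx a = ListCtx [a] [a]

-- Plugging an element back into the hole recovers the whole list.
plug :: a -> ListCtx a -> [a]
plug x (ListCtx before after) = reverse before ++ [x] ++ after
```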

answered Sep 22 '10 at 2:05 by "none"

  • I love that stuff. It's just so cool that the derivative has an application so wildly different from finding rates of change! – SamB Sep 22 '10 at 22:00
  • SamB, you might also be interested in derivatives of regular expressions (if you didn't already know about them). – jbapple Sep 22 '10 at 23:33
  • Here's the original paper on derivatives of regular expressions. – jameshfisher Jun 16 '13 at 10:37

Answer (12 votes):

Rangemaps

It is a specialized data structure, but it can be used as a substitute for Martin Erwig's DIET, with slightly different properties, so at least there is one existing data structure to compare it to. The DIET itself was described in an article in JFP in 1998, so perhaps it is not included in Purely Functional Data Structures.
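
A minimal sketch of the shared idea (names illustrative, and using a flat list of runs rather than Erwig's balanced tree, so it says nothing about the DIET's complexity):

```haskell
-- A set of Ints stored as maximal runs [lo..hi], kept sorted, disjoint,
-- and non-adjacent.  Erwig's DIET keeps these runs in a search tree;
-- a plain list keeps the sketch short.
type Ranges = [(Int, Int)]

member :: Int -> Ranges -> Bool
member x = any (\(lo, hi) -> lo <= x && x <= hi)

insert :: Int -> Ranges -> Ranges
insert x [] = [(x, x)]
insert x rs@((lo, hi) : rest)
  | x + 1 <  lo = (x, x) : rs        -- new singleton run before this one
  | x + 1 == lo = (x, hi) : rest     -- extend this run downward
  | x <= hi     = rs                 -- already covered
  | x == hi + 1 = fuse (lo, x) rest  -- extend upward, maybe fuse with next
  | otherwise   = (lo, hi) : insert x rest
  where
    fuse (a, b) ((c, d) : ys) | b + 1 == c = (a, d) : ys
    fuse r ys = r : ys
```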

answered Jan 28 '11 at 9:07 by Complicated see bio

Answer (5 votes):

Following up on the 2012 paper linked above, the work on RRB vectors has since been extended and published in ICFP'15.

RRB vector: a practical general purpose immutable sequence http://dl.acm.org/citation.cfm?id=2784739

answered Jun 28 '16 at 16:57 by Mike Rainey


