How does a text editor works on a local machine?

Naive Approach: Using a String

  • You could treat the document as one big string and insert/delete characters by creating new strings each time.

  • But many languages make strings immutable, so every modification creates a new string → slow, memory-inefficient, causes fragmentation.

Better Approach: Array of Characters

Represent text as an array:

  • Adding/removing at the end = fast

  • Overwriting = fast

  • But inserting in the middle = slow, because you must shift all following characters by one position.

  • With large documents (thousands/millions of characters), this becomes extremely inefficient.

Can we use linked lists or more accurately doubly linked lists?

Idea: represent each character as a node in a linked list.

Pros:

  • Easy insert/delete in the middle (just adjust pointers)

Cons:

  • Massive memory overhead:

    • Each character = 1 byte

    • Pointer(s) = 8–16 bytes

    • → Text becomes 17x larger in memory

  • Not cache-friendly

  • Slow for random access

Hybrid approach: linked list of lines (e.g., old vi editor). Works but still not optimal.

The Key Insight

Insertion is expensive only when characters must be shifted.
But users type in bursts:

  • Move cursor

  • Then type multiple characters

  • Then move cursor again

So… what if the editor prepared space in advance at the cursor?


Solution: The Gap Buffer

A gap buffer represents the document as:

[ text before cursor | GAP | text after cursor ]
  • The editor keeps an empty block (“gap”) where the cursor is.

  • Inserting characters means writing into the gap—no shifting needed.

  • When the cursor moves somewhere else:

    • The editor shifts characters once to move the gap there.

    • This “one big shift” is hidden while the user is moving the cursor and not typing.

Data structure stores:

  • The character array

  • Gap start index

  • Gap end index

  • Total buffer size

Insert

  • Write into the gap start

  • Move gap start forward by one

Delete

  • Just move gap end backward (no real removal needed)

Accessing characters

  • If index < gap_start → read normally

  • Else → offset by gap size to skip the empty region

Saving the file

  • Ignore the gap; concatenate text before and after it.


Visualization:

Consider, the cursor is after "be".

Inserting character shrinks the gap

Buffer (size = 20)

Index:  0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
        -------------------------------------------------
Chars: | T | o |   | b | e | _ | _ | _ | _ | _ | n  | o  | t  |   | t  | o  |   |  b |  e | 
                                  ↑— GAP ——↑
                                   gap_start = 6        
                                    gap_end = 11



After moving cursor after "not":

Index:  0 1 2 3 4 5 6 7 8 9 10 11 12 13 ...
        -----------------------------------------------------------
Chars: | T | o |   | b | e | b | e | n | o | t | _ | _ | _ | _ | _ |
                                                         ↑—— GAP ——↑
                                                         gap_start=10     gap_end=15

When the gap becomes too small, the editor resizes the entire buffer — similar to how ArrayList or Vector resizes internally.

Comments

Popular posts from this blog

Design - URL Shortener

Design - Gaming Leaderboard

Design - Booking System