This module implements an algorithm to compute the diff between two sequences of lines.
- To learn more see Diff on Wikipedia.
Example:
import experimental/diff assert diffInt( [0, 1, 2, 3, 4, 5, 6, 7, 8], [-1, 1, 2, 3, 4, 5, 666, 7, 42]) == @[Item(startA: 0, startB: 0, deletedA: 1, insertedB: 1), Item(startA: 6, startB: 6, deletedA: 1, insertedB: 1), Item(startA: 8, startB: 8, deletedA: 1, insertedB: 1)]
Example:
import experimental/diff # 2 samples of text (from "The Call of Cthulhu" by Lovecraft) let txt0 = """ abc def ghi jkl2""" let txt1 = """ bacx abc def ghi jkl""" assert diffText(txt0, txt1) == @[Item(startA: 0, startB: 0, deletedA: 0, insertedB: 1), Item(startA: 2, startB: 3, deletedA: 1, insertedB: 1)]
Types
Item = object startA*: int ## Start Line number in Data A. startB*: int ## Start Line number in Data B. deletedA*: int ## Number of changes in Data A. insertedB*: int ## Number of changes in Data B.
- An Item in the list of differences. Source Edit
SeqEdit = object kind*: SeqEditKind ## Sequence edit operation kind sourcePos*: int ## Position in the original sequence targetPos*: int ## Position in the target sequence
- Sequence edit operation. Source Edit
SeqEditKind = enum sekNone, ## Empty edit operation sekKeep, ## Keep original element unchanged sekInsert, ## Insert new element into target sequence sekReplace, ## Replace source element with the target sekDelete, ## Delete element from the source sequence sekTranspose ## Transpose two elements
- Kind of the sequence edit operation Source Edit
ShiftedDiff = object oldShifted*: seq[tuple[kind: SeqEditKind, item: int]] newShifted*: seq[tuple[kind: SeqEditKind, item: int]]
- Intermediate version of the diffed sequence Source Edit
Procs
proc diffInt(arrayA, arrayB: openArray[int]): seq[Item] {....raises: [], tags: [].}
-
Find the difference in 2 arrays of integers.
arrayA A-version of the numbers (usually the old one)
arrayB B-version of the numbers (usually the new one)
Returns a sequence of Items that describe the differences.
Source Edit proc diffText(text1, text2: string; sideBySide: bool; showLineNumbers: bool = false): string {....raises: [KeyError], tags: [].}
- Format diff of two text blocks via newline split and default formatDiffed implementation Source Edit
proc diffText(textA, textB: string): seq[Item] {....raises: [KeyError], tags: [].}
-
Find the difference in 2 text documents, comparing by textlines.
The algorithm itself is comparing 2 arrays of numbers so when comparing 2 text documents each line is converted into a (hash) number. This hash-value is computed by storing all textlines into a common hashtable so i can find duplicates in there, and generating a new number each time a new textline is inserted.
textA A-version of the text (usually the old one)
textB B-version of the text (usually the new one)
Returns a seq of Items that describe the differences.
Source Edit proc formatDiffed(shifted: ShiftedDiff; oldSeq, newSeq: seq[string]; sideBySide: bool; showLineNumbers: bool = false): string {. ...raises: [], tags: [].}
-
Plaintext format diff edit script for printing. Provides pretty barebones implementation of the formatting - no coloring, diff formatting is not configurable.
Generated diff formatting does not contain trailing newline
- sideBySide: stack modified lines on top of each other (unified diff) or side-by-side (split diff)
- oldSeq, newSeq: original diffed sequences with items converted to strings. It is assumed that each item is placed on one line and diffed between each other.
- showLineNumbers: Show original line (sequence) numbers in the generated string printout.
proc levenshteinDistance[T](str1, str2: openArray[T]): tuple[distance: int, operations: seq[SeqEdit]]
-
Compute edit distance between two item sequences, return list of edit operations necessary to transform str1 into str2
Adapted from https://phiresky.github.io/levenshtein-demo/
Source Edit proc myersDiff[T](aSeq, bSeq: openArray[T]): seq[SeqEdit]
- Diff overload without explicit comparator proc - use default == for two items. Source Edit
proc myersDiff[T](aSeq, bSeq: openArray[T]; itemCmp: proc (x, y: T): bool): seq[ SeqEdit]
-
Generate series of sequence edit operations necessary to trasnform aSeq into bSeq. For item equality comparison use itemCmp
https://gist.github.com/adamnew123456/37923cf53f51d6b9af32a539cdfa7cc4
Source Edit proc shiftDiffed[T](diff: seq[SeqEdit]; oldSeq, newSeq: openArray[T]): ShiftedDiff
- Align diff operations against each other, for further formatting. Source Edit
Iterators
iterator zipToMax[T](lhs, rhs: seq[T]; fill: T = default(T)): tuple[lhs, rhs: T, rhsDefault, lhsDefault: bool, idx: int]
- Iterate each argument to the end, filling in missing values with fill argument. This is an opposite of the std built-in zip which iterates up until min(lhs.len, rhs.len). Source Edit