Xah Talk Show 2022-12-22 Advent of Code Day 7, in WolframLang, Live Coding
problem spec
given this input, built a tree structure, with size info for each file.
$ cd / $ ls dir a 14848514 b.txt 8504156 c.dat dir d $ cd a $ ls dir e 29116 f 2557 g 62596 h.lst $ cd e $ ls 584 i $ cd .. $ cd .. $ cd d $ ls 4060174 j 8033020 d.log 5626152 d.ext 7214296 k
those starting $ means it's a command. possible commands are:
$ cd name
$ cd /
$ cd ..
$ ls
possible output lines:
dir dname
number fname
problem 1: find all dirs whose size is less than or equal to 100k. get the sum of their sizes.
notes, analysis of the problem
- whenever u see
$ cd /
, set xcurDir cd always comes in 1, or sequence of cd. after the last cd, then there's an ls.
- built a tree. each (non-leaf) node has a name and size. each leaf node has a name and size.
- how do we want to represent a tree. 2 major ways to represent a tree. a nested dictionary (key value pairs), or nested list where 1st item is meta info.
- if we use a tree:
- for each node, the first item is metaInfo: {metaInfo, childrenList }
- metaInfo is a list, {name, size, type} e.g. {"ab", 741, "file"}, {"/", 573, "dir"}
- type is one of "dir", "file"
- node: {metaInfo, children }
- children is a list of nodes
- two ways to represent a tree. nested dict, or nested list.
- if we use dict, then, it's easy to represent the position we want to add a new node.
- and that's just a list of keys, or dir names.
- dict[key1][key2][key3] = value
- if we use list, then, we need a list of position corresponding to the current dir.
- like this:
- list[index1][index2][index3] = value
- this is hard, because we need to maintain, how dir names corresponds to the indexes
- problems of storing the meta info, e.g. file or dir name, and size
- when using a nested dict to represent a dir,
- we can have a separate dict, where keys are file path, and value is size.
- this way, we can easily, find any node's size.
xinput = "$ cd / $ ls dir btsgrbd 3868 cprq.fmm dir gcbpcf dir hfm 324644 lthcng.gnf 133181 nblfzrb.mrr 140568 sfrbjmmh.jnj dir tfsh dir vlsqgrw 202279 vmpgqbcd $ cd btsgrbd $ ls dir cmfdm dir cqd dir gvwvs dir nblfzrb dir nfm 293979 qwnml.bqn 159220 sdwnsgwv.mjm 327978 vzgwwjq.zbp 155479 zvspnvfr.zbc $ cd cmfdm $ ls dir gldnjj dir vhf $ cd gldnjj $ ls dir dvght 93750 lwvtzd.pws 176529 sdwnsgwv.mjm 100111 vmpgqbcd $ cd dvght $ ls dir tfbzq $ cd tfbzq $ ls 276592 tcghw.srg $ cd .. $ cd .. $ cd .. $ cd vhf $ ls 240217 hfm.rfp dir nblfzrb $ cd nblfzrb $ ls 160378 jhc $ cd .. $ cd .. $ cd .. $ cd cqd $ ls 305358 bnddfgrb dir dwqncqp dir hnnfdtbh dir jhc dir nblfzrb 327762 scnm.qbf 165080 vmpgqbcd 190041 vzgwwjq.zbp dir zwv $ cd dwqncqp $ ls 122570 slpgmhv 278461 zlnbcwr $ cd .. $ cd hnnfdtbh $ ls 334830 gfprhn.rjj $ cd .. $ cd jhc $ ls 179593 fgb.btb $ cd .. $ cd nblfzrb $ ls dir clbcgvhc dir jhc dir lsrnz dir mctd $ cd clbcgvhc $ ls 285825 hnn 238272 nblfzrb.wvr $ cd .. $ cd jhc $ ls 99731 nblfzrb.svz $ cd .. $ cd lsrnz $ ls 257843 fsthnpmd $ cd .. $ cd mctd $ ls 278117 zlnbcwr $ cd .. $ cd .. $ cd zwv $ ls 40349 jhc dir pqwml 173804 sdwnsgwv.mjm $ cd pqwml $ ls 193573 hbzvzwpr $ cd .. $ cd .. $ cd .. $ cd gvwvs $ ls dir gjslw dir gwz dir ljvrjp dir sltlpb dir vbsnq $ cd gjslw $ ls dir gzbm $ cd gzbm $ ls dir fst dir gpjz dir gzd dir hfm $ cd fst $ ls 99806 mqpg $ cd .. $ cd gpjz $ ls dir dnsvsp 218828 jhc.dfd $ cd dnsvsp $ ls dir vmdbpwj dir zvspnvfr $ cd vmdbpwj $ ls 258373 jhc $ cd .. $ cd zvspnvfr $ ls 18241 vzgwwjq.zbp $ cd .. $ cd .. $ cd .. $ cd gzd $ ls 20383 chdfwj 63309 prrlv.rvn"; xlines = StringSplit[xinput, "\n" ]; (* process each line one at a time. and built list of path first. then turn the paths into a tree. *) (* each element is again a list, of dir/file names *) xpaths = {}; xLinePartsList = Map[ StringSplit , xlines ]; (* e.g. each is like this *) (* { "$", "cd", "e"} *) (* { "8033020" "d.log" } *) (* a list of dir names *) curDir = {}; sizeLookupTable = Association[]; dirTree = Association[]; Map[Function[{xx}, If[xx[[1]] === "$", (*line is a cmd*) (If[xx[[2]] === "cd", (* is cd*) If[xx[[3]] === "..", (curDir = Drop[curDir, -1]), (AppendTo[curDir, xx[[3]]])]; AppendTo[xpaths, curDir], (*it is ls*) Null]), (*line is not a cmd. it means it's size and filename, or dir and filename. if dir, ignore. if file, add to sizeLookupTable.*) If[ xx[[1]] === "dir" , Null, With[{fsize = xx[[1]], filename = xx[[2]], filepath = (Append[curDir, xx[[2]]]) }, AppendTo[xpaths, filepath]; AssociateTo[sizeLookupTable, filepath -> fsize]]] ]], xLinePartsList]; Map[ Function[{x}, expr] , xpaths ] (* dirTree[key][key][key] = val *)
misc notes
discoverd a logical flaw in unix path representation.
e.g. in
/a/b/c
,
a more logical way is
//a/b/c
,
because the root is a slash, and slash is also separator.