My programs have defects because some functions destroy or mutate shared data. I try to avoid mutating shared data in Common Lisp; for example, I use CONS to build new lists instead of changing existing ones. However, I made a mistake with SORT because I wasn't aware that SORT is destructive, and sometimes I still forget that it mutates its input. The function sort-snode below sorts part of a tree.
Since it is based on SORT, it mutates its input, snode. After I found that the output tree didn't look like what I expected, it took me days to see where my mistake was.
My workaround is to run COPY-LIST before SORT, as shown below.
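Here is a minimal sketch of the idea with a generic list (the real sort-snode, which sorts part of a tree, is not shown here):

;; SORT may destroy its input list, so sort a fresh copy instead.
(defun sort-copy (items)
  (sort (copy-list items) #'<))

;; (defparameter *xs* (list 3 1 2))
;; (sort-copy *xs*) ; => (1 2 3)
;; *xs*             ; still (3 1 2)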
Submitted to the AI Incident Database on 25 October 2022 (my first time!). Based on a report by Thai PBS on 4 October 2019, with information from additional sources. Appeared in the database on 26 October 2022. I will keep the extended report here for archival purposes. For a concise report and citation, please link to Incident 375 in the AI Incident Database.
—
Many Thais could not register for the government cash handout scheme because the app managing the government wallet failed to recognize their faces during authentication. People entitled to the handout had to wait in very long queues at their local ATMs instead to get authenticated.
The handout was limited to 10 million recipients in the first round, and a recipient had to register to claim it; Thailand has a population of almost 70 million. Registration involved an authentication step: taking a photo of the citizen identification card and of the card holder's face. If that failed, the citizen could authenticate at a supported ATM. About 3,000 ATMs nationwide supported the process.
On social media, internet users shared the problems they had, screenshots of messages from the app, and photo-taking techniques that might satisfy the facial recognition. The tips included applying face powder, putting the hair up, taking off eyeglasses, taking the photo in daytime sunlight, looking straight at the camera, styling the face and hair to match the photo on the ID card, and avoiding shadows on the ID card.
Elderly people were among the groups that suffered the most from the facial recognition issue, as their current faces can differ greatly from the photos on their ID cards. Under Thai citizen ID card law, people aged 70 or over no longer need to renew their cards, which are normally renewed every eight years. Because of this, the face on the ID card and the actual face can be very different.
Background
The cash handout program, called “Chim, Shop, Chai” (ชิมช้อปใช้, roughly translated as “eat, buy, spend”), aimed to promote domestic tourism.
The same government wallet, “G Wallet”, was later used for many other rounds of cash handout and co-pay programs during the COVID-19 pandemic, such as “Khon La Khrueng” (คนละครึ่ง, “each pays half”), where the same facial recognition issue still occurred.
The total number of individuals registered for these financial support programs is around 26.5 million. The wallet itself lives inside the “Paotang” (เป๋าตัง) super app, developed and managed by Krungthai Bank, a state enterprise. Paotang had 34 million active users as of June 2022.
Suriyawongkul, Arthit. (2019-09-29) Incident Number 375. In Lam, K. (ed.) Artificial Intelligence Incident Database. Responsible AI Collaborative. Retrieved 26 October 2022 from incidentdatabase.ai/cite/375.
AI for All, a project series on artificial intelligence and robotics for everyone, is supported by the Office of the National Higher Education Science Research and Innovation Policy Council (สอวช.). It has a section of articles and monthly news summaries, some of which cover the concerns and regulatory frameworks around deploying AI systems in society.
Many teams don't test their functions separately; they run the whole project to see the result, and when something goes wrong, they check the log and use a debugger. In my experience, such teams are the majority. Static type checking and type annotations are useful for these teams because annotations give a rough idea of the data each function handles. They can't look at test data, because it doesn't exist.
Still, I wonder whether mandatory type annotations are practical. Reading a long function is difficult, but after splitting long functions into smaller ones, I found that type annotations can be distracting: instead of focusing on logic, I have to think about the annotations, which are sometimes about half the size of the function itself. Maybe requiring annotations only on functions that an outsider from another module can call, as in OCaml, is practical. I haven't coded in OCaml beyond some toy programs, so I don't know whether it is really as practical as I imagine.
My computer has many cores but not enough RAM to hold all the data, so I usually need to process data from a stream in parallel, for example, reading Thai text line by line and dispatching the lines to workers running on each core.
In Clojure, I can use pmap because it works on a lazy sequence. However, I found no example of using lparallel's pmap with a lazy sequence in Common Lisp, so I used Bordeaux threads and ChanL channels instead. It worked, but I had to write repetitive code to handle threads and channels whenever I wanted to process a stream in parallel. Not only did it look messy, it also came with many bugs.
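The plumbing looked roughly like this (a sketch assuming bordeaux-threads' BT:MAKE-THREAD/BT:JOIN-THREAD and ChanL's unbuffered channels; error handling omitted, and this is not stream-par-procs' actual code):

(defun pmap-stream (fn stream &key (n-workers 4))
  "Apply FN to each line of STREAM in parallel; results come back in
arbitrary order.  A sketch only."
  (let* ((in  (make-instance 'chanl:channel))
         (out (make-instance 'chanl:channel))
         ;; Collector: gather results until every worker says it is done.
         (collector
           (bt:make-thread
            (lambda ()
              (loop with done = 0 and acc = '()
                    while (< done n-workers)
                    do (let ((msg (chanl:recv out)))
                         (if (eq msg :end-of-stream)
                             (incf done)
                             (push msg acc)))
                    finally (return acc))))))
    ;; Workers: process lines until the end-of-stream marker arrives.
    (dotimes (i n-workers)
      (bt:make-thread
       (lambda ()
         (loop for line = (chanl:recv in)
               until (eq line :end-of-stream)
               do (chanl:send out (funcall fn line)))
         (chanl:send out :end-of-stream))))
    ;; Reader: feed the stream to the workers, then signal termination.
    (loop for line = (read-line stream nil)
          while line
          do (chanl:send in line))
    (dotimes (i n-workers)
      (chanl:send in :end-of-stream))
    ;; BT:JOIN-THREAD returns the collector's result list.
    (bt:join-thread collector)))

Repeating this shape for every stream-processing task is exactly the mess I wanted to avoid.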
So I created a small library called stream-par-procs to wrap thread and channel management. As shown in the diagram, the reader reads lines from the stream, the system dispatches each line to one of the processors, and finally the collector builds the final result.
I only need to define a function to run in each processor, another function to run in the collector, and functions for initializing their states. The library hides the details, for example, joining the collector thread after every processor has sent END-OF-STREAM.
In brief, stream-par-procs makes processing a stream in parallel in Common Lisp more convenient, and hopefully less buggy, by reusing the thread and channel management code.
Fetching data from REST APIs or a plain HTTP server is a regular part of my work. Many problems can arise while fetching: a server goes offline, my storage runs out of space, an API gives a wrong URL, a server sends data in an invalid format, and so on. To illustrate the situation, I can oversimplify my task to the function below:
import requests

def fetch(uri):
    return requests.get(uri)
A fetcher usually hits at least one of these problems, and I don't want to rerun it after it has already run for an hour or a day. Maybe I could write my fetcher to resume from whatever state it stopped in, but that would take time and make the fetcher more complex. If I code in Python, the fetch function can raise many kinds of exceptions, and my fetcher will stop because my code doesn't handle them.
I see two types of environments that can help me fix the problem.
(1) A type checker may help me handle as many types of exceptions as possible upfront. Java can definitely do this: in Java, I have to declare the list of exceptions as part of the fetch function's signature, and javac checks that the caller handles them. Rust doesn't use Java-like exceptions, but its Result type encodes the types of errors, so the Rust compiler can check that a caller handles them.
(2) The Common Lisp runtime drops me into a debugger when an unhandled error occurs, so I can choose what to do next. For example, I can empty my trash folder if my storage is out of space and then retry. I still don't know whether I can handle the storage space problem this way with the Erlang runtime.
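A minimal sketch of that style (FETCH here is a placeholder that always fails, not a real HTTP call):

;; RESTART-CASE wraps the call with a RETRY restart.  When the error
;; lands me in the debugger, I can free some disk space and pick RETRY
;; instead of rerunning the whole fetcher.
(defun fetch (uri)
  (error "Pretend the storage is full while fetching ~A" uri))

(defun fetch-with-retry (uri)
  (loop
    (restart-case
        (return (fetch uri))
      (retry ()
        :report "Retry fetching the same URI."))))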
Python doesn't seem to belong to (1) or (2). I tried mypy, a static type checker for Python, but it didn't warn me about anything here. And I don't know how to retry or do something else with Python's exception system other than letting the program stop.
I ran average-1000-etipitaka-flexi-streams on SBCL 2.2.5-1.1-suse on my laptop with a Celeron N4500. It took 1.591 seconds.
Then I changed the file to my-data.ndjson.zst, whose average line length is 515.5 bytes. Running average-1000-ndjson-flexi-streams took 4.411 seconds.
So I also tested my customized utf8-input-stream. Running average-1000-etipitaka-utf8-input-stream and average-1000-ndjson-utf8-input-stream took 0.019 seconds and 0.043 seconds respectively, which means utf8-input-stream is 83x faster for short lines and 102x faster for long lines than flexi-streams in these tests.
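Each of these benchmarks boils down to something like the sketch below (my hypothetical reconstruction, not the actual benchmark code): read up to 1000 lines from a character stream and average their lengths, so nearly all the time goes to READ-LINE.

(defun average-line-length (character-stream &key (n 1000))
  "Average the lengths of the first N lines of CHARACTER-STREAM."
  (loop repeat n
        for line = (read-line character-stream nil)
        while line
        sum (length line) into total
        count t into m
        finally (return (if (zerop m) 0 (/ total m)))))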
I have a big newline-delimited JSON file compressed in Zstd format. I will call this file data.ndjson.zst.
I want my program to read the file line by line. I can use https://github.com/glv2/cl-zstd to make a binary stream from the file. Still, I can't call READ-LINE on a binary stream, so I need to wrap the binary stream with a flexi-stream.
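The whole pipeline looks roughly like this (a minimal sketch, assuming cl-zstd's MAKE-DECOMPRESSING-STREAM and flexi-streams' MAKE-FLEXI-STREAM):

;; Open the file as octets, decompress, then read UTF-8 lines.
(with-open-file (in "data.ndjson.zst" :element-type '(unsigned-byte 8))
  (let* ((binary-stream (zstd:make-decompressing-stream in))
         (line-stream (flexi-streams:make-flexi-stream
                       binary-stream :external-format :utf-8)))
    (loop for line = (read-line line-stream nil)
          while line
          do (print line))))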
Printing those lines is not a realistic example, so you can replace (print line) with your practical application, for example, parsing a JSON line and extracting a book title.
I want to use Emacs 28.1 on an aging OS where I must not modify /usr. I tried Docker, but I also want to use local commands, and SBCL didn't work properly on the old Docker. So I installed Emacs from a source tarball into my home directory.
However, Emacs failed to verify TLS certificates, so I also installed GnuTLS, Nettle, Libidn2, and Libunistring.
#!/bin/bash

export PKG_CONFIG_PATH=$HOME/lib64/pkgconfig:$HOME/lib/pkgconfig
export LD_RUN_PATH=$HOME/lib:$HOME/lib64
export LD_LIBRARY_PATH=$HOME/lib:$HOME/lib64
export LDFLAGS="-L$HOME/lib -L$HOME/lib64"

rm -rf libunistring-1.0 libidn2-2.3.2 nettle-3.6 gmp-6.2.1 emacs-28.1

curl https://ftp.gnu.org/gnu/libunistring/libunistring-1.0.tar.gz | tar xzvf - && \
pushd libunistring-1.0 && \
./configure --prefix=$HOME && \
make -j`nproc` && \
make install && \
popd || exit 1

curl https://ftp.gnu.org/gnu/libidn/libidn2-2.3.2.tar.gz | tar xzvf - && \
pushd libidn2-2.3.2 && \
./configure --prefix=$HOME && \
make -j`nproc` && \
make install && \
popd || exit 1

curl https://ftp.gnu.org/gnu/nettle/nettle-3.6.tar.gz | tar xzvf - && \
pushd nettle-3.6 && \
./configure --prefix=$HOME --enable-mini-gmp && \
make -j`nproc` && \
make install && \
popd || exit 1

curl https://www.gnupg.org/ftp/gcrypt/gnutls/v3.7/gnutls-3.7.6.tar.xz | tar xJvf - && \
pushd gnutls-3.7.6 && \
./configure --prefix=$HOME && \
make -j`nproc` && \
make install && \
popd || exit 1

curl http://ftp.gnu.org/pub/gnu/emacs/emacs-28.1.tar.gz | tar xzvf - && \
pushd emacs-28.1 && \
./configure --prefix=$HOME --with-x-toolkit=no --with-xpm=no --with-jpeg=no --with-gif=no --with-tiff=no --with-png=no && \
make -j`nproc` && \
make install && \
popd || exit 1

echo
echo
echo "Emacs 28.1 must be ready!"
echo
The worst thing I did in Rust was using unbounded asynchronous channels. It took days to figure out that my program had no backpressure, so the channels ate up all the RAM.
So, since then, I have always used sync_channel or bounded channels from Crossbeam.
I was once in a world where the code describing type signatures was longer than the code saying what the program does. I don't want to bring that situation back.
I expect AI code completion to replace static typing. Ruby Type Signature (RBS) and similar things should be inferred automatically from a database or other data sources.
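For example, here is my reconstruction of the kind of definition that triggers the warning below (the original code isn't shown; I am guessing from the package name STATIC1 and the types in the message): declaim that A returns a STRING, then define it to return a FIXNUM.

(defpackage :static1 (:use :cl))
(in-package :static1)

;; The declaimed return type says STRING ...
(declaim (ftype (function (fixnum) string) a))

;; ... but the body returns its FIXNUM argument.
(defun a (x)
  x)

Compiling this file with SBCL reports: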
; caught WARNING:
; Derived type of STATIC1::A is
; (VALUES FIXNUM &OPTIONAL),
; conflicting with its asserted type
; STRING.
; See also:
; The SBCL Manual, Node "Handling of Types"
So SBCL, a Common Lisp implementation, can check types at compile time. Anyway, a programmer still needs to read the warnings.
Last week, I watched a video about coding and Docker configuration on TV. I couldn't read a single line of the code. Then I thought about visually impaired people: how do they code? Every student in Thailand must learn to code, and I presume the situation is similar in every country.
Text-to-speech services are widely used these days, yet I cannot find any text-to-speech service for reading source code aloud. Let's assume we have a modified version of a text-to-speech service.
So I looked at source code in different programming languages on the CodeRosetta website. I perceive Python code blocks purely by their visual structure. To read Python source code aloud, I would have to encode that visual structure, namely the indentation, into sounds. A nested code block would be hard to follow; for example, reading twelve leading spaces aloud would be very strange. In Lisp, reading open and close parentheses is more straightforward, but I would forget which parenthesis matches which. So the best form of code block is QuickBasic's, which uses different keywords for different kinds of blocks, for example, FOR with NEXT and IF with END IF. Later I got a comment from Lemmy.ml telling me that Ada also uses different keywords for different kinds of blocks. Another idea from Lemmy.ml is that the reader could convert a Python code block into a similar form as Ada or QuickBasic before reading it aloud.
MBasic refers to code by line numbers instead of code blocks. However, after listening to five lines, I had already forgotten the line numbers. For example, when I heard GOSUB 70, I had forgotten what was at line 70.
In X86 Assembly, a programmer labels only the lines that the program jumps to, so X86 Assembly code reads much better than MBasic.
Still, coding in X86 Assembly can be exhausting in many cases. For example, X86 Assembly has no built-in support for recursion; you manage the stack yourself. Writing quicksort in X86 Assembly can be too difficult for someone learning to code.
Haskell doesn't rely on code blocks. However, reading symbols such as >>= aloud is challenging. Prolog's symbols are easy to read; for example, we can read :- as IF. Anyway, the Prolog programming paradigm is far from today's mainstream, so Erlang, whose syntax is similar to Prolog's, is a more practical alternative.
In brief, Erlang is a practical, less visual-centric programming language. It mostly relies on names instead of code blocks, and reading names aloud is much easier than reading code blocks aloud. Furthermore, the Erlang programming paradigm is closer to the current mainstream.
#################### F1 ###################
Evaluation count : 6 in 6 samples of 1 calls.
Execution time sample mean : 3.811858 sec
Execution time mean : 3.812064 sec
#################### F2 ###################
Evaluation count : 6 in 6 samples of 1 calls.
Execution time sample mean : 1.490624 sec
Execution time mean : 1.490777 sec
F1, the lazy sequence version, took 3.812064 seconds. F2, the transducer version, took 1.490777 seconds. So the transducer version is 155.71% faster than the lazy sequence version.
In brief, this biased experiment shows the transducer version is much faster than the pure lazy sequence version.
I recommend Clojure as a beginner's first programming language, "Poetry of Programming" as the first book for learning Clojure, and "Clojure for the Brave and True" as the second.
Lisp taught me to create functions that transform data step by step instead of adding mutable variables to loops. An immutable data structure is important for transforming data step by step without messing up existing data. Lisp's default data structure is the singly linked list.
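A tiny example of my own (not from any particular program): each step returns a fresh list and leaves its input untouched.

(defparameter *xs* '(3 1 2 5 4))

(defparameter *result*
  (mapcar (lambda (x) (* x x))       ; step 2: square each element
          (remove-if #'evenp *xs*))) ; step 1: drop the even numbers

;; *result* => (9 1 25), while *xs* is still (3 1 2 5 4)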
Lisp macros are easy to write compared to the competition. Rust macros are powerful, but a Rust macro transforms a TokenStream, while a Lisp macro transforms a plain list. I prefer transforming a list to transforming a TokenStream, since I don't have to learn a new structure and new commands for manipulating it.
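For instance, in this toy macro of my own, the template is just a backquoted list, the same data structure I manipulate everywhere else:

(defmacro with-timing (&body body)
  "Run BODY, print the elapsed internal time units, return BODY's value."
  (let ((start (gensym "START")))
    `(let ((,start (get-internal-real-time)))
       (prog1 (progn ,@body)
         (format t "~&took ~A ticks~%"
                 (- (get-internal-real-time) ,start))))))

;; (with-timing (sleep 1)) prints the elapsed ticks and returns NIL.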
And it has too many return statements. Anyway, I suppose Python doesn't favor recursion, so this isn't the way we would write it in idiomatic Python.
The code above concatenates Python lists instead of creating a linked list as in Lisp.
So I made this experiment.
import time

n = 100

# Concatenating vectors
v = []
t1 = time.time_ns()
for i in range(n):
    v = [i] + v
t2 = time.time_ns()

# Creating cons cells for a singly linked list
l = None
t3 = time.time_ns()
for i in range(n):
    l = (i, l)  # creating a cons cell
t4 = time.time_ns()

print((t2 - t1) / (t4 - t3))
At n = 100, concatenating vectors took about twice as long as creating cons cells for a linked list. So I keep using linked lists.
I want to install a specific version of RocksDB across different machines with different operating systems, so I can't rely on APT or other package management systems. I followed INSTALL.md by running make on the bundled Makefile, but last Saturday I found that RocksDB couldn't read Snappy-compressed data, although I had installed the "libsnappy-dev" package. I tried many different ways to enable Snappy support, then decided to use CMake, which appears only once in INSTALL.md. Now it works. My install script looks like this:
But if there are many fields, say 10 or more, the code starts to get long and hard to read. Using a Pydantic model makes it easier to manage. Normally we already use Pydantic models for JSON bodies, but for form data there is a small trick. I searched the internet, found what I think is the simplest approach, and am noting it down here.