ときどきの雑記帖 RE* (新南口)

Memories of Green

November 16, 2022

Price increase

On a whim I checked the price of Mathematica, and sure enough it had gone up (Mathematica Online: 2400 JPY/month → 3000 JPY/month).

Mathematica pricing for home and hobby use: personal license options

file

In a story about GitHub Copilot that was making the rounds a little while ago,

I wondered what this "Filed" meant, so I looked it up in a dictionary.

I see.

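Memo

Steps for removing an Ubuntu distribution installed under WSL that is no longer needed. I previously wrote up a two-step procedure, but it turns out a single operation does it (unsurprisingly).

Windows/WSL/Environment setup/Uninstalling a distribution - yanor.net/wiki

> wsl -l -v
(check the distribution name)
> wsl --unregister Ubuntu-18.04
(remove it)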

C is like asbestos

I thought calling C "asbestos" was an interesting way to put it. It is remarkably apt.

Nvidia Security Team: “What if we just stopped using C?” | Hacker News

C is like asbestos. It was fine at what it did, good performance, but the safety problems outweigh them. The difference is that we stopped using asbestos because it was unsafe. It’s still around but being replaced during renovations, and no new installations use it.

For whatever reason with C there’s this huge emotional component to it. Safer alternatives exist. You’d rightfully laugh at a contractor who suggested asbestos is fine, if you make sure to use only highly-skilled installers who patch up the drywall so that no fibers can escape. But with C we say that all the time, and the CVEs keep piling up.

The source article mentions:

Back in 2018, a Proof-of-Concept (POC) exercise was conducted. Two low-level security-sensitive applications were converted from C to SPARK in only three months.

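I wondered what SPARK was,

and this is it. No wonder the article is hosted at adacore.

On the subject of C (and C++), this topic has also come up in the last few days.

The underlying story is the same, yet the bookmark comments react quite differently, which is... (rest omitted).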

gawk csv

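A continuation of the story about CSV support in gawk.

Skimming the commits, the ones with "csv" in the title are the following (newest first; I may have missed some).

The ones near the top are changes to documentation and the like; the code that actually handles CSV is in the bottom three commits and the fifth from the bottom. The first commit looks like this (excerpt):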

comma_parse_field --- CSV parsing same as BWK awk.

@@ -741,6 +764,98 @@ sc_parse_field(long up_to,	/* parse only up to this field number */
 }
 
 /*
+ * comma_parse_field --- CSV parsing same as BWK awk.
+ *
+ * This is called both from get_field() and from do_split()
+ * via (*parse_field)().  This variation is for when FS is a comma,
+ * we do very basic CSV parsing, the same as BWK awk.
+ */
+static long
+comma_parse_field(long up_to,	/* parse only up to this field number */
+	char **buf,	/* on input: string to parse; on output: point to start next */
+	int len,
+	NODE *fs,
+	Regexp *rp ATTRIBUTE_UNUSED,
+	Setfunc set,	/* routine to set the value of the parsed field */
+	NODE *n,
+	NODE *sep_arr,  /* array of field separators (maybe NULL) */
+	bool in_middle ATTRIBUTE_UNUSED)
+{
+	char *scan = *buf;
+	static const char comma = ',';
+	long nf = parse_high_water;
+	char *field;
+	char *end = scan + len;
+
+	static char *newfield = NULL;
+	static size_t buflen = 0;
+
+	if (newfield == NULL) {
+		emalloc(newfield, char *, BUFSIZ, "comma_parse_field");
+		buflen = BUFSIZ;
+	}
+
+	if (set == set_field)	// not an array element
+		set = set_comma_field;
+
+	if (up_to == UNLIMITED)
+		nf = 0;
+
+	if (len == 0) {
+		(*set)(++nf, newfield, 0L, n);
+		return nf;
+	}
+
+	for (; nf < up_to;) {
+		char *new_end = newfield;
+		memset(newfield, '\0', buflen);
+
+		while (*scan != comma && scan < end) {
+			if (*scan == '"') {
+				for (scan++; scan < end;) {
+					if (*scan == '"' && scan[1] == '"') {	// "" -> "
+						*new_end++ = '"';
+						scan += 2;
+					} else if (*scan == '"' && (scan == end-1 || scan[1] == comma)) {
+						// close of quoted string
+						scan++;
+						break;
+					} else {
+						// grow buffer if needed
+						*new_end++ = *scan++;
+					}
+				}
+			} else {
+				// unquoted field
+				while (*scan != comma && scan < end) {
+					// grow buffer if needed
+					*new_end++ = *scan++;
+				}
+			}
+		}
+
+		(*set)(++nf, newfield, (long)(new_end - newfield), n);
+
+		if (scan == end)
+			break;
+
+		if (scan == *buf) {
+			scan++;
+			continue;
+		}
+
+		scan++;
+		if (scan == end) {	/* FS at end of record */
+			(*set)(++nf, newfield, 0L, n);
+			break;
+		}
+	}
+
+	*buf = scan;
+	return nf;
+}
+
+/*

Hmm.

Wait, how are fields that contain newlines handled?
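To poke at the newline question, here is a rough Ruby port of the loop in the excerpt. The name parse_csv_record is made up for illustration, it ignores up_to and the buffer handling, and it is only a sketch of the quoted logic, not a statement about how gawk as a whole behaves. The excerpt parses a single record handed to it through buf and len, so if the input has already been split into records at newlines (an assumption on my part), a quoted field containing a newline reaches the parser in two pieces:

```ruby
# Hypothetical Ruby port of the field loop in comma_parse_field (illustration only).
def parse_csv_record(record)
  return [""] if record.empty?          # mirrors the len == 0 case: one empty field
  fields = []
  i = 0
  n = record.length
  loop do
    field = +""
    while i < n && record[i] != ","
      if record[i] == '"'
        i += 1
        while i < n
          if record[i] == '"' && record[i + 1] == '"'      # "" -> "
            field << '"'
            i += 2
          elsif record[i] == '"' && (i == n - 1 || record[i + 1] == ",")
            i += 1                                          # close of quoted string
            break
          else
            field << record[i]
            i += 1
          end
        end
      else
        # unquoted field: copy up to the next comma
        while i < n && record[i] != ","
          field << record[i]
          i += 1
        end
      end
    end
    fields << field
    break if i >= n
    i += 1                              # skip the separator
    if i == n                           # separator at end of record
      fields << ""
      break
    end
  end
  fields
end

p parse_csv_record('a,"b,c",d')         # => ["a", "b,c", "d"]

# Assumption: records are split on newlines before the parser sees them.
"a,\"b\nc\",d".split("\n").each { |rec| p parse_csv_record(rec) }
# => ["a", "b"]
# => ["c\"", "d"]
```

Under that assumption the quoted field is not reassembled; whether gawk does something about this elsewhere is not visible from this excerpt.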

awkcc

I also spotted the string "awkcc" and thought, could it be...? But apparently it is only mentioned in the documentation (the manual).

Document awkcc. gawk.git - gawk

+'awkcc'
+     This is an early adaptation of Unix 'awk' that translates 'awk'
+     into C code.  It was done by J. Christropher Ramming at Bell Labs,
+     circa 1988.  It's available at <https://github.com/nokia/awkcc>.
+     Bringing this up to date would be an interesting software
+     engineering exercise.
+

Performance of perl interpreter

In an article making the point that Perl's performance has been improving too,

the numbers for the regular-expression operations are:

Perl Version:	5.12	5.16	5.20	5.24	5.28	5.32	5.36
Regex/Replace:	8.89	8.42	7.13	6.55	6.03	4.63	5.21
Regex/Replace utf8:	17.24	15.64	11.46	10.04	11.16	10.56	10.76

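The non-utf8 case improves monotonically up to 5.32 (the numbers get smaller), so I wonder why the utf8 case gets worse from 5.24 to 5.28, and why both get worse in 5.36 (with the non-utf8 case degrading by more). I am curious, but I won't look into it.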
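The regressions can be picked out mechanically. This is a small sketch over the two rows quoted above; the helper name regressions is made up, and the unit of the numbers is not stated in the excerpt, so the deltas are just differences in whatever unit the article uses (smaller is better):

```ruby
VERSIONS = %w[5.12 5.16 5.20 5.24 5.28 5.32 5.36]
TIMINGS = {
  "Regex/Replace"      => [8.89, 8.42, 7.13, 6.55, 6.03, 4.63, 5.21],
  "Regex/Replace utf8" => [17.24, 15.64, 11.46, 10.04, 11.16, 10.56, 10.76],
}

# Return [from_version, to_version, increase] for every step where the number went up.
def regressions(versions, values)
  steps = versions.each_cons(2).to_a.zip(values.each_cons(2).to_a)
  steps.select { |_, (before, after)| after > before }
       .map { |(from, to), (before, after)| [from, to, (after - before).round(2)] }
end

TIMINGS.each do |name, values|
  regressions(VERSIONS, values).each do |from, to, delta|
    puts "#{name}: #{from} -> #{to} got worse by #{delta}"
  end
end
```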

gcc extension

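While searching for something else, I came across a page titled "Extended syntax of gcc (Gnu C Compiler)".

What I was looking for was not on that page, but an interesting passage unrelated to my search caught my eye.

This page describes gcc's own extensions to C/C++ syntax. The mechanisms these extensions enable are certainly convenient, but since they of course do not conform to the ANSI standard, in general you should not use them.

Copying array variables.

In gcc, you can copy all the elements of an array variable as follows, but only between array variables of the same type and the same size.
   int a[8], b[8]="StringB";
   a = b;                      // GCC Extension?
However, the info documentation that ships with gcc has no description of this feature.

Wait, what is this?

So I looked into it a little.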

kbk@toybox4:~$ cat >ary.c
#include <stdio.h>
#include <stdlib.h>
int
main()
{
        char ary1[4] = {'a','b','c',0};
        char ary2[4] = {0};
        ary2 = ary1;
        printf("%s", ary2);
}
kbk@toybox4:~$ gcc -O0 ary.c
ary.c: In function ‘main’:
ary.c:8:7: error: assignment to expression with array type
    8 |  ary2 = ary1;
      |       ^

Hmm. The result is what you would expect. Maybe it works in C++ rather than C? I tried g++ as well, but

kbk@toybox4:~$ g++ -O0 ary.c
ary.c: In function ‘int main()’:
ary.c:8:9: error: invalid array assignment
    8 |  ary2 = ary1;
      |         ^~~~
kbk@toybox4:~$

no luck either (it is interesting that the error message differs slightly from C's).

So I searched around some more, and it appears to have been a bug in (past versions of) gcc.

c - A legal array assignment. Is it possible? - Stack Overflow

@FarouqJouti Apparently it was fixed. With gcc 5.8.0 and later it gets a fatal error error: assignment to expression with array type, even with -std=gnu89. With gcc 4.1.2, it compiles without error and “works”, and if you change char c[20]; to char c[21]; it complains error: incompatible types in assignment. I don’t see anything in the gcc 4.1.2 manual about this being a documented extension. – Keith Thompson Jun 11, 2020 at 2:02

Then, looking at the date on the page in question:

Copyright(C) by Naoki Watanabe. Oct 21st, 1995.

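so the gcc version at the time would have been 2.x?

Well, the page's author probably never expected someone to poke at it from the other side of the turn of the century 😄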

FORTRAN Compiler on IBM 704

B

A continuation of the story about the B that appears in constants.

Coding_for_the_MIT-IBM_704_Computer_Oct57.pdf, p. 149, "The SHARE Assembler", has this passage:

If the character B appears in a decimal data word, the word is converted as a fixed point binary quantity. The binary scale factor used in this conversion is the number which follows immediately after the character B; this number being the number of binary places between the left-hand end binary result. If the decimal point does not appear in the decimal data word, it is assumed to be at the right-hand end. The decimal exponent used in this conversion is the number which follows immediately after the character E. The order of B and E is not significant. For example, 12.345B4, +1.2345E1B4, and 12345B4E-3 are all equivalent representations of the same fixed point quantity.
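The equivalence claimed at the end of the passage is easy to check with exact arithmetic. The following is only a sketch: parse_dec is a made-up helper, it handles just the notation quoted above, and it treats a missing B or E number as 0, which is my assumption.

```ruby
# Parse a decimal data word such as "12.345B4" into [exact value, binary scale].
# Hypothetical helper for illustration; not taken from the SHARE Assembler.
def parse_dec(text)
  sign, int_part, frac_part = text.match(/\A([+-]?)(\d*)\.?(\d*)/).captures
  mantissa = Rational((int_part + frac_part).to_i, 10**frac_part.length)
  mantissa = -mantissa if sign == "-"
  e = text[/E([+-]?\d+)/, 1].to_i   # decimal exponent, 0 when absent
  b = text[/B(\d+)/, 1].to_i        # binary scale factor, 0 when absent
  [mantissa * Rational(10)**e, b]
end

%w[12.345B4 +1.2345E1B4 12345B4E-3].each do |word|
  value, scale = parse_dec(word)
  puts "#{word}: value=#{value} B=#{scale}"
end
```

All three words come out as the same exact value (2469/200, i.e. 12.345) with B = 4, in whichever order B and E appear.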

I had assumed it meant a shift to the left, but it apparently specifies a shift to the right.

M1CON  DEC 0,1,2,3,4,10B17,1B17,2B17,4B17,8B17,17B17,18B17,35B17        F3B11740

Taking this line as an example, the constants assemble to

+000000000000
+000000000001
+000000000002
+000000000003
+000000000004
+000012000000
+000001000000
+000002000000
+000004000000
+000010000000
+000021000000
+000022000000
+000043000000

which does look consistent with that reading.
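One way to make "consistent" concrete: assume a 36-bit word with a sign and 35 magnitude bits, and read Bn as placing the binary point n bits from the left end of the magnitude. Then an integer v with scale n should land at v * 2^(35 - n). Both the word layout and the helper name fixed_point_word are assumptions of this sketch, not something the quoted manual passage spells out.

```ruby
MAGNITUDE_BITS = 35  # assumption: 36-bit word = sign bit + 35 magnitude bits

# Place `value` so that its binary point sits `b` bits from the left end.
def fixed_point_word(value, b)
  (value * 2**(MAGNITUDE_BITS - b)).floor
end

[10, 1, 2, 4, 8, 17, 18, 35].each do |v|
  printf "%2dB17 -> %012o\n", v, fixed_point_word(v, 17)
end
```

With B17 the multiplier is 2^18, which is exactly six octal digits, so each constant shows up in octal followed by six zeros, as in the listing.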

The fractional values were a little harder to follow:

Assembled result	Notation in source
000000000021	.5BE-9
000000000253	.5BE-8
000000003265	.5BE-7
000000041433	.5BE-6
000000517426	.5BE-5
000006433342	.5BE-4
000101422335	.5BE-3
001217270243	.5BE-2
014631463146	.5BE-1
200000000000	.5BE

A quick-and-dirty script like this one

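# Return the first 35 fractional bits of v / 10**e as an integer (truncated).
def get_bits(v, e)
  m = 10**e
  bits = 0
  35.times do
    v *= 2                                  # shift the next bit into the integer position
    bits = (bits << 1) | (v >= m ? 1 : 0)   # the bit is 1 once the doubled value reaches 1.0
    v %= m                                  # keep only the fractional remainder
  end
  bits
end

# .5BE-n is 0.5 * 10**-n, i.e. 5 / 10**(n+1)
9.downto(1) do |n|
  printf "%d %012o\n", n, get_bits(5, n+1)
end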

confirms it:

9 000000000021
8 000000000253
7 000000003265
6 000000041433
5 000000517426
4 000006433342
3 000101422335
2 001217270243
1 014631463146

I see.
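The bit-by-bit loop amounts to long division, so the same values should also fall out of a single integer division, floor(5 * 2^35 / 10^(n+1)). A sketch of that cross-check, with truncated_fraction_bits as a made-up name and truncation (rather than rounding) assumed because that is what the loop does:

```ruby
# 0.5 * 10**-n scaled to 35 fractional bits, truncated toward zero.
def truncated_fraction_bits(n)
  (5 << 35) / 10**(n + 1)
end

9.downto(1) do |n|
  printf "%d %012o\n", n, truncated_fraction_bits(n)
end
```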
