2018-12-10

Transactional Information Systems 5章 MVCC勉強会　第二回　議論メモ

勉強会について

Transactional Information Systems 5章 MVCC勉強会　第二回の議論メモです。

自分のメモをベースにまとめています。発言の聞き間違い、解釈違いの可能性があることをご了承ください。

特記の無い引用は本で議論した箇所を示しています。

Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery (The Morgan Kaufmann Series in Data Management Systems)

作者: Gerhard Weikum,Gottfried Vossen
出版社/メーカー: Morgan Kaufmann
発売日: 2001/05/24
メディア: ハードカバー
クリック: 2回
この商品を含むブログ (2件) を見る

[memo]は私のメモです。勉強会で議論された内容ではないことにご注意ください。
また、関連資料・引用・リンクを適宜挿入しています。

5.2 Multiversion Schedules

An example of a monoversion schedule is
m = r₁(x₀)w₁(x₁)r₂(x₁)w₂(y₂)r₁(y₂)w₁(z₁)c₁c₂

monoversion (1V) は version が1つだけ。read は直前に write したものを読む。

1V と serial schedule を混同してはいけない。これらは異なる。

5.3 Multiversion Serializability

devising a notion of view serializability and investigating whether it is feasible. Not surprisingly, it will again exhibit an NP complete decision problem, which motivates to look for a notion of conflict serializability.

View Serializability (VSR) はNP完全で、実用上は（計算量を減らすために）更なる制限が必要となる。
次回の勉強会では VSR の証明を行う。

1V は conflict serializability (CSR)、Multiversion (MV) は VSR が理論の中心となっている。

View を考える際には write したけど read されないものをどう扱うかも考える必要がある。

5.3.1 Multiversion View Serializability

when making serializability precise for multiversion schedules, we have to keep in mind that we are here taking about transparent versioning, that is, versioning that is not visible to the outside world, in particular to application programs or even the user. Thus, from a user's point of view a "correct" multiversion schedule should be equivalent to an ordinary serial schedule without versions. To make equivalence precise, conflict equivalence of the usual style (i.e., according to Chapter 3) is not an appropriate candidate

ユーザやアプリケーションに対しては version を見せずに "correct" multiversion schedule を作らなければならない。そうなると、conflict equivalence はマッチしない。
この例が EXAMPLE 5.3 で示されている。

EXAMPLE 5.3
Consider serial (monoversion) schedule s = w₀(x)c₀w₁(x)c₁r₂(x)w₂(y)c₂.
A multiversion schedule exhibiting the same "step syntax" is
m = w₀(x₀)c₀w₁(x₁)c₁r₂(x₀)w₂(y₂)c₂
Notice that while s shows three different conflicts (between w₀(x) and w₁(x), between w₀(x) and r₂(x), and between w₁(x) and r₂(x)), m exhibits only one conflict, the one between w₀(x₀) and r₂(x₀), since t₁ now writes a new version of x.
（後略）

1V では r₂(x₀) を実現できない。1V では直前の write を read することになるため。

commit を含めた schedule で conflict を議論することは正しいのか…？ここで commit を含めたのは r₂(x₀) を強調したかった意図が見える。だが、commit を抜きにした方が concurrent に見えるし、適切に思える。

m では w₀(x₀), r₂(x₀) だけが conflict となる。異なる version を read していることを conflict とは言わない。

conflict には2種類ある。
(1) r-w の conflict は operation レベル。
(2) Reads-From (RF) の conflict は transaction (TX) レベル。T₁ が T₂ の結果を読んでいる、ということ。
本では(1)(2)を一緒くたにしている。これが意図的なものかはわからないが…。

[memo]
勉強会では DEFINITION 3.3 Herbrand Semantics of Steps の簡単な解説がありました。Herbrand Semantics についてまとめた記事のリンクを貼っておきます。
エルブラン・セマンティクス(Herbrand Semantics)理解メモ - ぱと隊長日誌

Herbrand Semantics 以外の semantics があってもよいのではないか？
version を上手く導入したい。

DEFINITION 5.4 Reads-From Relation
Let m be a multiversion schedule, t_i, t_j ∈ trans(m).
The reads-from relation of m is defined by
RF(m) := {(t_i, x, t_j) | r_j(x_i) ∈ op(m)}

この定義では commit / abort について触れていない。

"(t_i, x, t_j)" はTX、"r_j(x_i) ∈ op(m)"は operation。
つまり、TXの関係を導くために operation を使っている。

TX の依存関係は Herbrand Semantics につながる。

物理的に read できる or できないことを RF とは言わない。

ここでは r_∞ を考えていない。
最終状態が何であれ、そこまでの RF が正しければ equivalent と言って良い。r_∞ が x₀ （初期状態）を read したってよい。r_∞ がそれ以降 overwrite されることはないため。

DEFINITION 5.5 View Equivalence
Let m and m' be two multiversion schedules such that trans(m') = trans(m).
m and m' are view equivalent, abbreviated m ≈v m', if RF(m) = RF(m').

"if" なので RF が等価なら View Equivalence ということ。

一旦 fix した TX の集合があるとする。ここに新たな TX 集合が来た時、新しい TX 集合も含めて reschedule するだろうか？そんなことをしたら、いつまでたっても状態が決まらない。
どこかで区切るという制約を考えればよいのでは？元の集合が新たな集合に対する初期値となる。

Linearizability とは TX が concurrent な間は入れ替え可能ということ。逆に concurrent な区間が終わると古い値は読めない。
[参考]
Linearizability については熊崎さんのスライドでも解説されています。

トランザクションをSerializableにする4つの方法 from Kumazaki Hiroki

トランザクション入門 from Kumazaki Hiroki

r₁(x₂)w₂(x₂)c₂c₁ を考える。
r₁(x₂)w₂(x₂) は物理的に成り立たない（write する前に read しているから）。
だが、commit order も含めた全体で考えれば、論理的には成り立つ。であれば、論理が成り立つように物理を入れ替えてやればよいのではないか？

A difference between this notion of view equivalence and the one introduced in Chapter 3 for monoversion schedules is that a consideration of final write operations is now irrelevant, for the simple reason that two multiversion schedules with identical sets of transactions always have the same final writes (since writes produce versions that are not erased anymore).

"final writes" が複数形であることからみて、過去の write は全て残っていて、read protocol でどの write を読むかが1つに決まるということではないか？

"since writes produce versions that are not erased anymore" は過去の write が全て残っているということ。

r(∞) は無視するという宣言であり、コミット後の version は複数あってよいということ。r(∞) は全ての write が read できて、read protocol 次第で read する値が1つに決まる。実際の read protocol を考えると、commit された latest な値となりがち。

[memo]
この箇所について神林さんと個別にお話をさせていただきました。
r(∞)で MV が収束しないとすると、以下の問題がありそうだと提起されていました。

どのタイミングで version を削れるかがわからない
意図的な epoch 跨ぎ（バッチ処理を想定）で生成された version と、今後利用される可能性が低い使用済みの version という異なる意味のものが混在してしまう

EXAMPLE 5.5
Let m = w₀(x₀)w₀(y₀)c₀r₁(x₀)r₁(y₀)w₁(x₁)w₁(y₁)c₁r₂(x₀)r₂(y₁)c₂.
If each r_i(x_j) occurring in m is replaced by an r_i(x), and correspondingly each w_i(x_i) by a w_i(x), the following monoversion schedule results:
s = w₀(x)w₀(y)c₀r₁(x)r₁(y)w₁(x)w₁(y)c₁r₂(x)r₂(y)c₂.
Both schedules are serial; however, in s, transaction t₂ reads data item x from t₁, while in m, it reads x from t₀.

m は serial order だが TX₂ が x₀, y₁ を read している。これは 1V である s と異なる結果になる。
このことから、View Equivalence は reads-from から判定する必要があるとわかる。

ここで挙げている例は説明として適切でない。read protocol の整合性が取れていない。これでは version function が乱数で決まっているようにすら見える…。もっと整合性の取れた例で議論したい。

This example-driven discussion leads directly to the following correctness criterion:

ここでの correctness は3章の correctness 関数と同じことを指している。

すなわち， correctness という関数を定義し，これについての定式化を行う．

この関数は，とりあえず「Concurrency Problemsが起きているか？」を検証するboolだと思えばよい．

correctness が真となる限りにおいて，その関数で定義した Concurrency Problems は起こらない．

これで，強い correctness を使えばあらゆる Concurrency Problems は起こらないし，弱い correctness がどのような Problems を含むのかも定義可能となる．

最も強い correctness が「直列実行したか？」である，と考えるとわかりやすいだろう．

TIS Chapter 3 - nikezono

DEFINITION 5.6 Multiversion View Serializability
Let m be a multiversion history. Then m is called multiversion view serializable if there exists a serial monoversion history m' for the same set of transactions such that m ≈v m'.
Let MVSR denote the class of all multiversion view-serializable histories.

ここまでは operation の話だったのに、ここで serial monoversion の話が出てくるのはなぜか？
serial multiversion history の世界を考えればもっと広がる可能性はあるが、それを考えきれないので 1V に絞ったのでは？

5.3.2 Testing Membership in MVSR

m = w₀(x₀)w₀(y₀)c₀r₁(x₀)w₂(x₂)w₂(y₂)c₂r₁(y₀)c₁

m は Inconsistent Read。TX₁ の間に TX₂ が入っている。

[memo]
この例は Inconsistent Read Anomaly というより Read Skew Anomaly のほうが適切に思えます。
いろんなAnomaly - Qiita

ですが本では

this schedule is the canonical example of an inconsistent read

としています。
また、c₂ の後に r₁(x) があれば Inconsistent Read になることを考えると、ここで厳密に区別する必要はないのかもしれません。

The previous theorem implies that the membership test for MVSR can intuitively not be easier than the one for VSR, since MVSR is a relaxation of conditions over VSR; in other words, additional restrictions (in VSR) over MVSR, we obtain an NP complete decision problem.

VSR でも難しいのに、MVSR が簡単なわけがない。

THEOREM 5.2 の Proof の読み解き方。
まず、problem が多項式時間で解けるため NP （NP complete とは異なることに注意）であるといっている。
そして、VSR が NP complete なのだから MVSR も NP complete であるといっている。
もし、MVSR が簡単に解けるのであれば、MVSR より制限された VSR も簡単に解けるはずである。だが、VSR が NP complete なのだからそれはありえない。よって、MVSR は NP complete といえる。

"3.7.2 On the Complexity of Testing View Serializability" で VSR が NP complete であることを証明している。

1V と MV における Conflict Graph と Serialization Graph の違いについて。

グラフは G=(V,E) で表され、V は頂点（vertices）の集合、E は頂点と頂点をつなぐエッジ（edges）の集合である。形式的には、グラフ G は順序対 G=(V,E) で定義され、V は有限の集合、E は V から選んだ2つの元からなる集合の集合である。
グラフ (データ構造) - Wikipedia

DEFINITION 3.15 Conflict Graph (Serialization Graph)
Let s be a schedule. The conflict graph, also known as the serialization graph, G(s) = (V, E) of s, is defined by
V = commit(s),
(t,t') ∈ E ⇔ t ≠ t' ∧ (∃p ∈ t) (∃q ∈ t') (p, q) ∈ conf(s)

1V では Conflict Graph と Serialization Graph が同じ。

DEFINITION 5.8 Multiversion Serialization Graph (MVSG)
For a given schedule m and a version order <<, the multiversion serialization graph MVSG(m, <<) of m then is the conflict graph G(m) = (V, E) with the following edges added for each r_k(x_j) and w_i(x_i) in CP(m), where k, i, and j are pairwise distinct: if x_i << x_j, then (t_i, t_j) ∈ E, otherwise (t_k, t_i) ∈ E.

MV では Conflict Graph と Serialization Graph を分けて考える必要がある。

Conflict Graph は operation レベルの話であり、Serialization Graph は TX レベルの話である。

1V では Conflict Graph がそのまま TX に map できるが、MV ではそれができない。

lifting という概念を fekete がSSI論文で言ってる

Operation による Graph を Lifting して Transaction の Graph を作っている

Transactional Information Systems Chapter 5読み会 - nikezono

該当論文を示します。
http://db.cs.berkeley.edu/cs286/papers/ssi-tods2005.pdf
Making Snapshot Isolation Serializable
3.1 Lifting Paths from SDG(A) to DSG(H)

参考

Transactional Information Systems Chapter 5読み会 - nikezono
中園さんの勉強会メモ。

2018-11-26

Transactional Information Systems 5章 MVCC勉強会　第一回　議論メモ

データベース勉強会/セミナー

勉強会について

Transactional Information Systems 5章 MVCC勉強会　第一回の議論メモです。

自分のメモをベースにまとめています。発言の聞き間違い、解釈違いの可能性があることをご了承ください。

特記の無い引用は本で議論した箇所を示しています。

Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery (The Morgan Kaufmann Series in Data Management Systems)

作者: Gerhard Weikum,Gottfried Vossen
出版社/メーカー: Morgan Kaufmann
発売日: 2001/05/24
メディア: ハードカバー
クリック: 2回
この商品を含むブログ (2件) を見る

[memo]は私のメモです。勉強会で議論された内容ではないことにご注意ください。
また、関連資料・引用・リンクを適宜挿入しています。

勉強会の目的・方針

write に強い DB を作りたい。現状は write が遅い。

Multi Version (MV) なら Write Lock をとらずに write できるはず。だが、現状の実装に Write Lock をとらないものはない。この状況を変えたい。

今回の勉強会は読むよりディスカッションを重視したい。

5.1 Goal and Overview

So far the underlying assumption has been that data items exist in exactly one copy each.

この本でここまでは1つのコピーが前提だった。

concurrent な間は MV だが未来永劫 MV なわけではない。commit したら Single Version (SV) になる。

read は read from、つまり何を読むかが大事。

It will turn out that versioning demands some reasonable extensions of serializability theory, and, as will be shown in Part Ⅲ, versioning is connected to recovery, as versions of modified data (before images or after images, or both) are sometimes maintained in some form by a recovery manager anyway.

recovery でも version の考え方を導入しないといけない。例えば、write の before / after を利用して recovery するのは実質 MV ともいえる。
だが、本来は recovery と Concurrency Control (CC) を分けて考えないといけない。この点を意識して本を読み進める必要がある。

昔（本が想定している時代）と今では commit の前提が変わっているのではないか？
昔の commit はどこまで recovery しなければならないかを示すものだった。
今は oneshot request / group commit が前提となっている。また、write の実行順序の制御に commit が利用されるようになってきていると思われる。

w₁w₂c₂c₁ というスケジュールを SV では実現できない。w₁ を w₂ で上書きした後、c₂ よりも後に c₁ が来ているからだ。だが、MV なら可能だ。

We will also look at the storage space needed to hold multiple versions.

Storage Space とはどこを指しているのか？
本の時代はディスクベースでメモリはバッファとして利用していた。
現在はメモリベースとなっている。
この前提で考えた時にどうか？

While theoretical considerations tend to assume space to be an unlimited resource, it turns out that the picture changes significanty should this not be valid anymore. In particular, we will show how multiversion serializability is affected by limiting the (total) number of versions around.

commit した後も version の保存は必要なのだろうか？recovery ができるのであれば、必要としているより古い version は捨ててよいのではないか？

MVは resource というよりデータ構造が議論の的になるのではないか？

lost update は view の anomaly だ。update の事実が lost しているわけではない。MV なら何も考えずに write を行うことができるので、writeを lost するということもない。

w₀(x)c₀r₁(x)r₂(x)w₁(x)w₂(x)c₁c₂
このスケジュールの問題点は t₁ も t₂ も update するつもりなのに、同じ item の同じ version を t₁ / t₂ が read していることではないか？

An important assumption we will make in this chapter is that versioning is transparent to the outside world. In other words, users or applications will not be aware of the fact that data items can appear in multiple versions.

ユーザーサイドに version を意識させることはない。でも、一部例外はある。
version のデータベースがある。MVCC とそれを混同してはならない。

[memo]
version のデータベースにはバージョン管理システム（Subversion, Git 等）のようなものも含むのか？

この点について記事を公開後にコメントをいただきました。version database は従来の relational database の機能に加え、version の概念も取り込んだ database とのことです。

One alternative is to use a source code version control system like git or svn to manage dataset versions. However, source code version control systems are both inefficient at storing unordered structured datasets, and do not support advanced querying capabilities, e.g., querying for versions that satisfy some predicate, performing joins across versions, or computing some aggregate statistics across versions [12].
（中略）
To answer this question we develop a system, ORPHEUSDB1 , to “bolt-on” versioning capabilities to a traditional relational database system that is unaware of the existence of versions. By doing so, we seamlessly leverage the analysis and querying capabilities that come “for free” with a database system, along with efficient versioning capabilities.
http://www.vldb.org/pvldb/vol10/p1130-huang.pdf

CASE (computer-aided software engineering) とはモデルを描くだけでコードが生成されるような古のツールのこと。
Computer Aided Software Engineering - Wikipedia

The discussion in this chapter continues to consider unaborted transactions only.

unabort を前提にするのはよくない（考慮すべき）。

ここでの abort は system abort ではなく user abort を指している。
serializable であるかの validation に失敗した時は system abort となる。

system commit とはDBが決める commit のことだ。この本の "commit" の前提でもある。
user commit とはユーザーが意思表示する commit のことだ。

[memo]
user commit はクライアント側が発行する commit のことであり、system commit は DB 側が CC を実現するために実行する commit のことと理解すればよいか？DB は CC のためにクライアント側の commit 要求の順序を変更することが可能なため。

これからの DB は 1 SQL 1 commit でよくないか？つまり、インタラクティブな操作を認めない。AutoCommit のイメージ。そうでないと速くすることはできない。

5.2 Multiversion Schedules

EXAMPLE 5.1
※一部を抜粋
s = r₁(x)w₁(x)r₂(x)w₂(y)r₁(y)w₁(z)c₁c₂
s ∉ CSR
r₁(y) is "too late" for making the schedule that already has conflict (on x) between t₁ and t₂ an acceptable one.
Now suppose that the old value of y (in this the initial value of y) were still available when r₁(y) arrived.
s' = r₁(x)w₁(x)r₁(y)r₂(x)w₂(y)w₁(z)c₁c₂
s' ∈ CSR

s の r₁(y) を前に持っていくだけでCSRになるのであれば、何とか実現する方法があるのではないか？

read で何を読むかを決めなければいけない。

先に進むには Conflict Serializable (CSR) を理解しておく必要がある。
CSR の conflict の定義とは r-w, w-r, w-w であり、つまり2つの transaction (TX) が同じ item を操作する際に一方が write であるということだ。

データ処理の競合関係「だけ」に注目した解釈を導入します。この競合(conflict)を特にデータの書き込みについてのみ注目します。すなわち
w-w　あるTX書き込んで、別のTXがまた書き込む
w-r　あるTXが書き込んだものを、別のTXが読み込む
r-w　あるTXが読み込んだものを、別のTXが書き込む
この競合関係が維持されていれば、同一のスケジュールとみなすという「解釈」です。まあ、ぶっちゃけr-rの関係は順序は関係ないでしょ？という発想ですね。すなわち、書き込みの競合関係が同一である集合に属していれば、そのスケジュールは、同じコンテクストであると意味です。この集合をconflict serializableといいます。
Welcome back to the TRANSACTION! - 急がば回れ、選ぶなら近道

SV の conflict をイメージしすぎると MV で混乱することになる。

実装はCSRでもMVSRでも両者ともどもエッジグラフでのvalidationになるので、結果としては「同じ」read-setとwrite-setのintersectionの話になっているところがややこしい。やっていることは似ているが、導かれているクライテリアは異なるということだ。
Multi-version Conflict Serializability - 急がば回れ、選ぶなら近道

本の P99 に CSR で許される並び替えルール (C1 - C4) が記述されている。

Our presentation initially considers only totally ordered schedules since these are closer to the "algebraic" nature of the following commutativity rules for page model data operations. In these rules "～" means that the ordered pair of actions on the left-hand side can be replaced by the right-hand side, and vice versa.
Rule C1: r_i(x)r_j(y) ～ r_j(y)r_i(x) if i ≠ j
Rule C2: r_i(x)w_j(y) ～ w_j(y)r_i(x) if i ≠ j, x ≠ y
Rule C3: w_i(x)w_j(y) ～ w_j(y)w_i(x) if i ≠ j, x ≠ y

This additional rule simply states that two unordered operations can be arbitrarily ordered if they are nonconflicting. We refer to this rule as the ordering rule:
Rule C4:
o_i(x), p_j(y) unordered
↝ o_i(x)p_j(y) if x ≠ y ∨ (o = r ∧ p = r)

View Serializable (VSR) は Herbrand Semantics で表現できる。
Serializablitiyとは？ 2. View Serializabilityについて - Qiita

read-from が変わらないことは CSR であることの一つの条件である。
C1 - C4 のルールを破ると read-from が壊れてしまう。正確には壊す可能性があるということであり、厳密にはr_∞を考える必要がある。

Basically, a read step of the form r(x) reads an (existing) version of x, and a write step of the form w(x) (always) creates a new version of x (or overwrites an existing one).

r(x) の existing を最新の version の事とは断言していない。複数 version が存在する可能性を残している。
例えば group commit で w-r-w を考えると、read は通常であれば前の write をreadするが、後の write を read することにしても良い。
PostgreSQL には DEFERRABLE READ ONLY があるが、これは後の write を読むのと同じことを実現しているように思える。

この本での history とは逐次実行の事である。verify で並べ替えるのはまた別の話となる。

MV では read 発行時に version が決まることはない。取りうるスケジュールは複数ある。Rule C4 は半順序から全順序を生成するルールとなる。

C4だけが特殊なルールとして存在する．

C1,C2,C3は暗黙的にスケジュール/ヒストリの命令列に全順序があることを要求している．

しかしConcurrency Controlではタプルごとの命令の半順序しか得られないことが多い．

半順序から全順序を生成するルールが以下のC4.

Reducibility - nikezono

DEFINITION 5.1 Version Function
Let s be a history with initialization transaction t₀ and final transaction t_∞. A version function for s is a function h, which associates with each read step of s a previous write step on the same data item, and which is the identity on write steps.

read version の決め方にはいくつもある。例えば、
・自分の write
・直前の write
・commit 済みの write
のようにいくつも取りうる。

Commit, Install, TX の order はそれぞれ異なる。

DEFINITION 5.1 で h は Herbrand をイメージしているのではないか？history に h ではなく s を割り当てていることからもうかがえる。

the result of applying a version function to a read operation could be any of (1) the associated write operation, (2) the version assigned for reading, or (3) the "versioned" read operation. Since all three possibilities are basically equivalent, this should not cause any confusion.

(1)～(3) が equivalent とは？なぜあえて説明しているのか？そこにどんな意図があるのか？
(2) は read そのものが version を指定する場合。
(3) は TX そのものの version を read の version に適用する場合。

w₁w₂c₁
latest なものを read するのであれば w₂ の version となる。
だが、commit したものを read するのであれば w₁ のversion となる。

本で想定している実装は貧弱。なので、全体的に実装を避けて書いているようにも見える。

参考

Transactional Information Systems 5章 MVCC勉強会第一回に参加 - Yabu.log
yuYabu さんが勉強会のメモを早々にまとめられていました。仕事が早い！

Transactional Information Systems Chapter 5読み会 - nikezono
中園さんの勉強会メモ。

デヴすぴスラ (@dev_supisula) | Twitter
デヴすぴスラさんが Twitter で勉強会の内容を踏まえた考察を重ねられています。

更新情報

2018/11/29

version database について追記しました。
中園さんの勉強会メモのリンクを追記しました。

2018-11-20

PostgreSQL の実行計画に現れる One-Time Filter の読み解き方

データベース

テーマ

PostgreSQLの実行計画に現れる One-Time Filter とは、クエリ―実行時に1回だけ評価すればよいフィルターのことです。

例えば、次のようなクエリーで現れます。

SELECT * FROM pg_type WHERE CURRENT_DATE = '2018-12-01'::date;

このWHERE句はクエリー実行時に1度だけ評価すれば決まります。このような場合、実行計画に One-Time Filter として出力されます。

この One-Time Filter について、以下の観点で調べてみました。

Filter と One-Time Filter の違い
One-Time Filter の出現パターン

検証は PostgreSQL 11.0 で行いました。

なお、PostgreSQLのバージョンによって実行計画の出力結果に若干の違いがあるようですので、ご注意ください。

Filter と One-Time Filter の違い

実際の実行計画で確認します。

(1) Filter のみ
=# EXPLAIN ANALYZE SELECT * FROM pg_type WHERE typtype = 'b';
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Seq Scan on pg_type  (cost=0.00..13.75 rows=142 width=252) (actual time=0.013..0.131 rows=143 loops=1)
   Filter: (typtype = 'b'::"char")
   Rows Removed by Filter: 239

Filter はそのノードで取得した各行に対して評価を行います。(1)だと pg_type の382行のうち、239行がフィルタによって削除され、143行が残ったとわかります。

(2) One-Time Filter のみ
=# EXPLAIN ANALYZE SELECT * FROM pg_type WHERE CURRENT_DATE = '2018-12-01'::date;
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Result  (cost=0.01..12.80 rows=380 width=252) (actual time=0.004..0.004 rows=0 loops=1)
   One-Time Filter: (CURRENT_DATE = '2018-12-01'::date)
   ->  Seq Scan on pg_type  (cost=0.01..12.80 rows=380 width=252) (never executed)

One-Time Filter は1回限りの評価です。(2)では One-Time Filter が先に評価され、WHERE句を満たさないことが分かったため、pg_type のスキャンは実行されませんでした。

(3) Filter と One-Time Filter
=# EXPLAIN ANALYZE SELECT * FROM pg_type WHERE typtype = 'b' AND CURRENT_DATE = '2018-12-01'::date;
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Result  (cost=0.01..13.75 rows=142 width=252) (actual time=0.004..0.004 rows=0 loops=1)
   One-Time Filter: (CURRENT_DATE = '2018-12-01'::date)
   ->  Seq Scan on pg_type  (cost=0.01..13.75 rows=142 width=252) (never executed)
         Filter: (typtype = 'b'::"char")

WHERE句に Filter と One-Time Filter となる条件を組み合わせた場合が(3)になります。Filter は Seq Scan ノードで、One-Time Filter は Result ノードで評価されています。この場合も(2)と同様に One-Time Filter が先に評価され、Seq Scan は実行されませんでした。

One-Time Filter の出現パターン

調査の過程で見つかった主なパターンをご紹介します。他にもあればぜひご連絡ください。

PostgreSQLプランナはプランニングの段階で評価結果の true / false が明確な場合、それに応じたプランを生成しているようです。

(4)
=# EXPLAIN SELECT * FROM pg_type WHERE true;
                         QUERY PLAN
------------------------------------------------------------
 Seq Scan on pg_type  (cost=0.00..12.80 rows=380 width=252)

(5)
=# EXPLAIN SELECT * FROM pg_type WHERE 'a' IN ('a', 'b', 'c');
                         QUERY PLAN
------------------------------------------------------------
 Seq Scan on pg_type  (cost=0.00..12.80 rows=380 width=252)

(4)のようにWHERE句が true であることが明確な場合、One-Time Filter が出力されません。これは(5)のようにプランニングの段階で評価可能な場合も含まれています。

(6)
=# EXPLAIN SELECT * FROM pg_type WHERE false;
                 QUERY PLAN
--------------------------------------------
 Result  (cost=0.00..0.00 rows=0 width=235)
   One-Time Filter: false

(7)
=# EXPLAIN SELECT * FROM pg_type WHERE 'd' IN ('a', 'b', 'c');
                 QUERY PLAN
--------------------------------------------
 Result  (cost=0.00..0.00 rows=0 width=235)
   One-Time Filter: false

(6)のようにWHERE句が false であることが明確な場合、One-Time Filter は出力されますが、スキャン（例：Seq Scan）は出力されません。プランニングの段階で評価可能な(7)でも同様です。

(8)
=# EXPLAIN SELECT * FROM pg_type WHERE CURRENT_DATE = '2018-12-01'::date;
                            QUERY PLAN
------------------------------------------------------------------
 Result  (cost=0.01..12.80 rows=380 width=252)
   One-Time Filter: (CURRENT_DATE = '2018-12-01'::date)
   ->  Seq Scan on pg_type  (cost=0.01..12.80 rows=380 width=252)

(8)の場合はプランニングの段階でWHERE句を評価できないため、そのまま One-Time Filter に出力されています。実行する際に One-Time Filter を評価し、条件を満たさなければ下位ノードの Seq Scan は実行されません。

(9)
=# EXPLAIN SELECT * FROM pg_type WHERE typtype = 'b' AND EXISTS (SELECT 1 WHERE 1+1=3);
                            QUERY PLAN
------------------------------------------------------------------
 Result  (cost=0.01..13.76 rows=142 width=252)
   One-Time Filter: $0
   InitPlan 1 (returns $0)
     ->  Result  (cost=0.00..0.01 rows=1 width=0)
           One-Time Filter: false
   ->  Seq Scan on pg_type  (cost=0.01..13.76 rows=142 width=252)
         Filter: (typtype = 'b'::"char")

(9)の場合、WHERE句が false となることは明白ですが、プランニングの段階ではそこまで評価せず、そのままプランに出力されています。

EXISTS句が "One-Time Filter: $0" として出力されています。これは EXISTS句内のクエリがその他のクエリの結果に依存しておらず、単独で実行可能かつ1回限りの評価でよいためと思われます。

EXISTS句内の "SELECT 1 WHERE 1+1=3" は "InitPlan 1 (returns $0)" として出力されています。このSELECT結果が $0 に代入され、"One-Time Filter: $0" で評価されます。

このケースで Seq Scan は実行されません。

参考資料

PostgreSQL プラン・ツリーの概要
Resultノードの説明の中で One-Time Filter について少し触れられています。