Lee's Stego Research Notes: March 2008

Sunday, March 30, 2008

Defending Against Statistical Steganalysis (part 1)

N. Provos presented the paper "Defending Against Statistical Steganalysis" at the 10th USENIX Security Symposium, August 13-17, 2001. The paper explains how the steganography tool OutGuess 0.2 works.

This post focuses on the part of the paper that covers the core technique of OutGuess: Section 3.

Section 3 Embedding Process

The author splits the embedding process into two independent steps:

1. Identification of redundant bits.
Redundant bits can be modified without detectably degrading the cover medium.
As the author puts it, redundant bits are bits that can be modified without causing any detectable degradation of the cover medium.

2. The selection of bits in which the hidden information should be placed.

The benefit of splitting the process into two steps is easy replacement: to implement the paper's method for another data format, one only has to swap out the identification algorithm and then apply a new selection strategy.

Section 3.1 Identification of Redundant Bits

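The author makes the point that the redundant bits used to embed a secret message usually depend on the image's storage format. The embedding procedure as a whole therefore also depends on the output format, and a compression step is usually part of it. Minimizing the modification of the cover-medium requires knowledge of its redundant bits; the author notes that OutGuess itself carries out all the operations needed to produce the output image.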

For example, the OutGuess system performs all operations involved in creating the output object and saves the redundant bits encountered. For the JPEG image format, this might be the LSB of the discrete cosine transform coefficients.

Section 3.2 Selection of Bits

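This section discusses how to pick, from the image's redundant bits, the bits that will carry the secret message. OutGuess encrypts the secret message with the RC4 stream cipher and also uses RC4 to build a PRNG (pseudo-random number generator); the chosen seed is then fed into this PRNG to select redundant bits.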

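32 state bits = 16-bit seed + 16-bit integer
16-bit seed: different seeds select different redundant bits for embedding the secret message, so different seeds naturally change the original image in different ways. When the receiver gets the stego-image, it must know which seed was chosen, so the 16-bit seed must also be embedded in the cover-image.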
16-bit integer: containing the length of the hidden message.
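
The 32 state bits described above could be packed as in the following sketch. The field order (seed first, then length), the big-endian byte order, and the names `pack_state` / `unpack_state` are assumptions made for illustration; the notes above only say that the state consists of a 16-bit seed and a 16-bit length.

```python
import struct

def pack_state(seed, length):
    """Pack a 16-bit seed and a 16-bit message length into 4 bytes.

    ">HH" means two unsigned 16-bit integers, big-endian (an assumed layout).
    struct.pack raises struct.error if either value does not fit in 16 bits.
    """
    return struct.pack(">HH", seed, length)

def unpack_state(blob):
    """Recover (seed, length) from the 4 bytes produced by pack_state."""
    return struct.unpack(">HH", blob)
```

A 16-bit seed gives the embedder at most 65536 candidate seeds to try, and a 16-bit length field caps the message length at 65535 units.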

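Redundant bits are selected by using the PRNG above to compute a random offset R_i(x) to the next bit:

　b_0 = 0,
　b_i = b_{i-1} + R_i(x) 　for i = 1, 2, ... , n

b_i is the position of the i-th selected bit, and R_i(x) is the random distance from the previously selected bit, drawn from [1, x]. x is the maximum interval; it is recomputed after every 8 embedded bits using the formula below, so that the secret message is spread across all the available bits.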

　interval = 2 * remaining redundant bits / remaining length of message.

Setting the interval this way limits the length of the secret message to at most 50% of the embedding space.
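
The selection rule of Section 3.2 can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: Python's `random.Random` stands in for the RC4-based PRNG, the name `select_positions` and its parameters are made up for this note, and the cap on the interval is a guard added here so the sketch never runs past the last redundant bit (the paper summary above does not say how OutGuess handles that case).

```python
import random

def select_positions(num_redundant_bits, message_len, seed):
    """Return 1-based positions b_1..b_n with b_i = b_{i-1} + R_i(x)."""
    if message_len > num_redundant_bits:
        raise ValueError("message does not fit into the redundant bits")
    rng = random.Random(seed)  # assumption: stand-in for the RC4-based PRNG
    positions = []
    pos = 0        # b_0 = 0
    interval = 1   # x, the maximum offset
    for i in range(message_len):
        if i % 8 == 0:  # recompute x after every 8 embedded bits
            remaining_bits = num_redundant_bits - pos
            remaining_msg = message_len - i
            # interval = 2 * remaining redundant bits / remaining message length
            interval = 2 * remaining_bits // remaining_msg
            # Guard (not from the paper): keep room for the rest of the message
            # even if every draw in this block of up to 8 bits hits the maximum.
            block = min(8, remaining_msg)
            cap = (remaining_bits - remaining_msg + block) // block
            interval = max(1, min(interval, cap))
        pos += rng.randint(1, interval)  # R_i(x) in [1, x]
        positions.append(pos)
    return positions
```

Because the offsets are drawn from [1, x] and x is about twice the average spacing needed, the selected positions end up spread over the whole range of redundant bits rather than clustered at the front.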

Section 3.3 Beneficial Reseeding of the PRNG

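This section discusses how trying different seeds lets us intelligently choose among different subsets of embedding positions. Doing so not only lowers the total number of changed bits but also makes the embedding harder to detect (detectability is also used as a bias in the selection process).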

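Each redundant bit in the cover-image is either 1 or 0, and the data to be embedded is first encrypted with the RC4 stream cipher, which turns it into a binary random stream. When the secret message is embedded into the redundant bits, each redundant bit is therefore changed with probability 0.5. That is why the binomial distribution from statistics is exactly the right tool for describing ordinary LSB embedding.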

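Suppose we feed a seed into the PRNG to select 4430 of the redundant bits and embed a secret message of the same length into them. We can then count how many redundant bits this embedding changed. Note: feeding a different seed into the same PRNG selects different embedding positions. Figure 1 is obtained by repeating this with different seeds, counting each time how many of the 4430 redundant bits were changed, and plotting the accumulated statistics.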

Figure 1: Probability distribution of changed bits for different seeds compared to a binomial distribution with n=4430 and p=0.5.

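Whether we reason from the binomial distribution formula or look at the experiment in Figure 1, we can see that for a given seed the most likely number of changed bits is n/2 = 2215. Still, there are some seeds for which the number of changed bits is below 2150. The paper discusses this as follows: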

Picking a seed that represents the changed bits at the lower end of the binomial distribution allows us to reduce the number of bits that have to be changed; see Figure 1. It becomes harder to detect the modifications, as more of the hidden message is already naturally represented in the redundant bits.
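
To get a feeling for how rare such seeds are, the lower tail of the binomial distribution with n = 4430 and p = 0.5 can be computed exactly. The sketch below is an illustration only; the function name `binomial_cdf_half` is made up for this note.

```python
def binomial_cdf_half(k, n):
    """Exact P(X <= k) for X ~ Binomial(n, 0.5), using integer arithmetic."""
    total = 0
    term = 1  # C(n, 0)
    for i in range(k + 1):
        total += term
        term = term * (n - i) // (i + 1)  # C(n, i + 1) from C(n, i)
    return total / 2 ** n

# Probability that a single seed changes fewer than 2150 of 4430 bits.
p_below_2150 = binomial_cdf_half(2149, 4430)
```

The value comes out at a few percent (between 0.02 and 0.03), so trying a few hundred seeds should be enough to find several candidates in that range.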

Besides reducing the number of modifications, detectability is another factor that the selection process takes into account.

Detectability is also used as a bias in the selection process. The selector does not try to reduce only the number of changed bits but also the overall detectability. Whenever a bit has to be modified, its detectability will be added to a global bias. A higher accumulated bias reduces the likelihood that this specific embedding will be used.
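
A seed search of the kind described in Section 3.3 might look like the sketch below. It is a simplification under stated assumptions: `random.Random` replaces the RC4-based PRNG, positions are drawn with `sample` instead of the interval rule of Section 3.2, the detectability bias is left out, and the names `count_changes` and `best_seed` are hypothetical.

```python
import random

def count_changes(cover_bits, message_bits, seed):
    """Count how many selected cover bits differ from the message bits."""
    rng = random.Random(seed)  # assumption: stand-in for the RC4-based PRNG
    positions = rng.sample(range(len(cover_bits)), len(message_bits))
    return sum(cover_bits[p] != m for p, m in zip(positions, message_bits))

def best_seed(cover_bits, message_bits, num_seeds=256):
    """Try seeds 0..num_seeds-1 and return (seed, changes) with the fewest changes."""
    changes = {s: count_changes(cover_bits, message_bits, s) for s in range(num_seeds)}
    seed = min(changes, key=changes.get)
    return seed, changes[seed]
```

With a 16-bit seed, `num_seeds` can go up to 65536; a real selector would also add each modified bit's detectability to a per-seed bias before comparing candidates.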

Section 3.4 Choices with Coding Theory

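The author brings in coding theory here because selecting redundant bits with a PRNG will inevitably pick some
1. locked bits
2. bits with a high detectability
The author does not want to modify either of these two kinds of redundant bits, and therefore proposes using error-correcting codes to deal with the problem.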

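[n, k, d] coding means that a k-bit piece of the secret message (k-bit data block) is encoded into an n-bit code block, and the Hamming distance between any two codewords is at least d. If d = 2t + 1, the code can correct t bit errors. In other words, if up to t of the n bits in a code block are wrong, the decoding procedure can work out which bits are wrong and correct them, so the original k bits of data can be recovered completely.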

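Encoding the secret message with an error-correcting code certainly increases the length that has to be embedded. However, looking at the embedding process as a whole, we observe that:
1. on average, half of the embedded bits already match the redundant bits and change nothing ( n / 2 ),
2. up to t bits can be left unembedded (the redundant bit is not changed).
So, suppose
　( n / 2 ) - t = ( k / 2 )
holds. That is, the author wants the number of bits modified when embedding an n-bit code block (the left-hand side) to equal the number of bits modified when embedding the unencoded k-bit data block (the right-hand side). Multiplying both sides by 2 gives n - 2t = k, and substituting 2t = d - 1 (from d = 2t + 1) gives
　d = n - k + 1,
which is exactly the Singleton bound met with equality, the defining property of an MDS (maximum distance separable) code. The author therefore concludes that choosing an MDS code satisfies the requirement above.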

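Unfortunately, the only binary MDS codes worth mentioning (non-trivial) are repetition codes. Their main drawback is that encoding repeats the data n times, so repetition codes are used only when the hidden message is very small.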
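
A binary repetition code is an [n, 1, n] code, so with odd n = 2t + 1 it satisfies n - 2t = k for k = 1. The sketch below shows the encoding and a majority-vote decoder; the function names are made up for this note, and n is assumed to be odd.

```python
def rep_encode(bits, n):
    """Repeat every data bit n times (an [n, 1, n] repetition code)."""
    return [b for b in bits for _ in range(n)]

def rep_decode(code, n):
    """Majority vote per n-bit block; corrects up to t = (n - 1) // 2 errors."""
    return [int(sum(code[i:i + n]) > n // 2) for i in range(0, len(code), n)]

data = [1, 0, 1]
code = rep_encode(data, 5)            # n = 5, so t = 2
for start in range(0, len(code), 5):  # flip 2 bits in every block
    code[start] ^= 1
    code[start + 3] ^= 1
recovered = rep_decode(code, 5)       # equals data again
```

In the embedding setting, the "errors" are the up to t positions per block where the embedder chooses not to touch the redundant bit; the price is an n-fold increase in the number of bits that have to be placed.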

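After encoding, each code block may leave up to t positions where the redundant bit is not modified. The selection strategy the author uses is:
the bit to be embedded differs from the redundant bit (a conflict), and the redundant bit has already had data embedded in it earlier, so it is locked.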
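
One way to read this strategy is sketched below for a single code block. It is an interpretation, not the OutGuess implementation: the function `embed_code_block` and its arguments are hypothetical, and high-detectability bits are ignored.

```python
def embed_code_block(cover_bits, locked, code_bits, t):
    """Embed one n-bit code block while leaving locked bits untouched.

    Positions where a locked cover bit conflicts with the code bit are skipped.
    If more than t such conflicts occur, the decoder could no longer correct
    them, so the block cannot be embedded here and None is returned.
    """
    conflicts_on_locked = [
        i for i, (c, m) in enumerate(zip(cover_bits, code_bits))
        if locked[i] and c != m
    ]
    if len(conflicts_on_locked) > t:
        return None
    # Locked bits keep their value; every other bit takes the code bit.
    return [c if locked[i] else m
            for i, (c, m) in enumerate(zip(cover_bits, code_bits))]

# n = 5 repetition block for the data bit 1, with t = 2.
stego = embed_code_block([0, 0, 1, 1, 0],
                         [True, False, True, False, False],
                         [1, 1, 1, 1, 1], t=2)
```

Here only position 0 is a locked conflict, so the result `[0, 1, 1, 1, 1]` still decodes to 1 by majority vote.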

The technique used by OutGuess is similar to the parity encoding suggested by Ross J. Anderson and Fabien A. P. Petitcolas in their paper On the Limits of Steganography, published in the IEEE Journal on Selected Areas in Communications, 16(4): 474-481, May 1998. However, error-correcting codes offer more benefits than parity encoding. By choosing a code that is not MDS, we can give up a little embedding capacity in exchange for higher security. In addition, whereas parity encoding has to lock n bits, error-correcting codes only need to lock n-t bits.

Section 3.5 Plausible Deniability

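To embed the secret message, we modify redundant bits in the cover medium. These redundant bits may have natural statistical properties that we are not aware of, or that an adversary understands better than we do. If the embedding process changes those properties, a knowledgeable observer can infer that a hidden message exists without having to point out which specific bits were changed.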

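The creator of the stego medium has to face the possibility that the hidden communication is exposed. We assume, however, that the only fact the observer can establish is that the cover medium was modified. If the sender embeds multiple messages, one of them can be an innocuous message that sits alongside the message the sender really wants to transmit and can be handed over on request. The sender can then claim that nothing else is hidden in the stego medium, a claim that is plausible because the medium shows no evidence of any further message. This is what is called plausible deniability.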

　　

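In fact, the techniques described throughout Section 3 already implicitly support the plausible deniability mentioned above. More than one message can be hidden, with locked bits keeping an earlier message from being overwritten by a later one. Depending on the sizes of the embedded messages, the chance of not overlapping any previously locked redundant bits is small; in that case, error-correcting codes add flexibility to the selection.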

Section 3.6 Hidden Message Determines Cover

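For a given hidden message, one can choose among several cover media the one on which the message has the least impact (minimal modification), and embed the message there. This is much the same as the binomial distribution discussion in Section 3.3.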
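
The idea of Section 3.6 can be sketched as a search over candidate covers instead of over seeds. This is a simplified illustration: the message is assumed to go into the first redundant bits of each cover in order (no PRNG), and the names `changes_needed` and `pick_cover` are made up for this note.

```python
def changes_needed(redundant_bits, message_bits):
    """Number of redundant bits that would have to flip to hold the message."""
    return sum(c != m for c, m in zip(redundant_bits, message_bits))

def pick_cover(covers, message_bits):
    """Return the name of the cover whose redundant bits need the fewest changes.

    covers maps a cover name to its list of redundant bits; covers that are
    too small for the message are ignored.
    """
    usable = {name: bits for name, bits in covers.items()
              if len(bits) >= len(message_bits)}
    return min(usable, key=lambda name: changes_needed(usable[name], message_bits))

covers = {"a": [0, 0, 0, 0], "b": [1, 0, 1, 1], "c": [1, 0]}
chosen = pick_cover(covers, [1, 0, 1, 0])  # "b" needs only one change
```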

Go to: Defending Against Statistical Steganalysis (part 2)

Go to: Defending Against Statistical Steganalysis (part 3)

Niels Provos, "Defending Against Statistical Steganalysis," 10th USENIX Security Symposium, August 13-17, 2001.
　

Wednesday, March 19, 2008

The Difference between Conference and Workshop

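During a discussion in class yesterday, a graduate student asked me what the difference is between a conference and a workshop. I remembered hearing a discussion of exactly this a few weeks ago in the chatroom segment of Studio Classroom (空中英語教室), so I dug out the file and we listened to it together. It just occurred to me that when Professor 黃世育 and I were planning a sophomore multimedia course for departments outside the College of Information, we included simple audio file processing, and Professor 黃 suggested GoldWave. So I searched for the software online and downloaded it, intending to cut the chatroom segment into a separate file and post it here to share with the graduate students.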

Sunday, March 16, 2008

IHW 2008: Information Hiding Workshop 2008

Information Hiding Workshop 2008
Santa Barbara, California, USA,
May 19-21, 2008

For many years, information hiding has captured the imagination of researchers. Digital watermarking and steganography tools are used to address digital rights management, protect information, and conceal secrets. From an investigator's perspective, information hiding provides an interesting challenge for digital forensic investigations, and steganalysis techniques allow hidden information to be discovered. These are but a small number of related topics and issues. Current research themes include:

　　* Watermarking (attacks, security, algorithms)
　　* Anonymity and privacy
　　* Steganography and steganalysis
　　* Multimedia security
　　* Other hiding domains (e.g. text, software, etc.)
　　* Information assurance
　　* Digital forensics
　　* Covert/subliminal channels
　　* Theoretical aspects of information hiding
　　* Intrusion detection
　　* Digital rights management
　　* Novel technologies/applications

Continuing a successful series that brought together these closely linked research areas, the 10th Edition of Information Hiding (IH08) will be held in Santa Barbara, California.
Call for papers

Saturday, March 15, 2008

EDAS Conference Management System

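To submit to ISC 2008 you have to go through the EDAS Conference Management System. In other words, we first have to register at EDAS and get an account before we can submit a paper. After using EDAS for the past two days, my impression is quite good. On submission, the draft must not contain any author information, to keep the review fair, so every author also needs an EDAS account and is then added through the add-author function.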

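Many conferences take paper submissions through EDAS. When I first came across EDAS last week, I saw information on its website about many conferences that were accepting submissions. It struck me then that the system's Submit paper function brings together conference announcements that used to be scattered all over the place, which should be a very handy channel for scholars who attend conferences every year.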

Tuesday, March 11, 2008

Message from ISC 2008

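At noon today I received a mail from ISC 2008 describing the current submission status. Taiwan has only 8 submissions, which really is rather few. I hope everyone will submit to ISC 2008,...

The mail I received is as follows: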

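Dear all,
TWISC and the information security association (資訊安全學會) will host the 11th Information Security Conference, an international conference, in September. Director 李德財 is serving as General Chair, and the proceedings will appear in LNCS. There are currently 56 submissions, more than 17 of them from Japan, yet we, the host country, have only 8. The submission deadline is 3/15, and the conference website is http://isc08.twisc.org/ for reference. If you have a manuscript on hand, please do submit it.

PS: By academic convention, a paper accepted to LNCS can, after a major revision, also be submitted to a journal. And if the paper (draft) is not accepted, the reviewers' comments will still be of great help in revising it later.

Sincerely, 吳宗成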

Thursday, March 06, 2008

ISC08: Information Security Conference 2008

September 15-18, 2008
Taipei, Taiwan
Website http://isc08.twisc.org/index.html

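This is the big annual event in Information Security, and this year it happens to be held in Taiwan. The conference Topics include our research area, Information Hiding. (The other important international meeting is IHW, the Information Hiding Workshop, held this year on May 19-21 in Santa Barbara, California, USA; its paper deadline was February 2, 2008, so it is already far too late.)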

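This year's original ISC Submission Deadline was March 1, 2008. My original plan was to submit my paper to ICISOIE 08, which is part of WCE 2008; in fact I had already reformatted it yesterday into the six-page WCE 2008 format, ready to submit today. On the drive to school this morning I kept thinking about switching to ISC 08. After talking it over in the car with Professor 黃世育, we decided to split the two papers we wrote together between the two conferences: Professor 黃's CAPTCHA paper goes to WCE 2008 as planned, and my paper, Modified LSB Embedding, goes to ISC 08 instead. We will attend both conferences.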

I spent all of yesterday adjusting the paper to the WCE 2008 format. Now that we have decided to switch, today I have to reformat it again for ISC 08.

Call for papers

Important Dates
Conference Dates: September 16-18, 2008
Submission Deadline: March 15, 2008, 11pm GMT (firm deadline)
Notification of Acceptance: May 20, 2008
Camera-ready Copies Deadline: June 15, 2008

Topics of Interest
ISC aims to attract high quality papers in all technical aspects of information security.
Topics of interest include, but are not limited to, the following:

　* Access Control
　* Accounting and Audit
　* Anonymity and Pseudonymity
　* Applied Cryptography
　* Attacks and Prevention of Online Fraud
　* Authentication and Non-repudiation
　* Biometrics
　* Cryptographic Protocols and Functions
　* Database and System Security
　* Design and Analysis of Cryptographic Algorithms
　* Digital Rights Management
　* Economics of Security and Privacy
　* Formal Methods in Security
　* Foundations of Computer Security
　* Identity and Trust Management
　* Information Hiding and Watermarking
　* Infrastructure Security
　* Intrusion Detection, Tolerance and Prevention
　* Mobile, Ad Hoc and Sensor Network Security
　* Network and Wireless Network Security
　* Peer-to-Peer Network Security
　* PKI and PMI
　* Private Searches
　* Security and Privacy in Pervasive/Ubiquitous Computing
　* Security in Information Flow
　* Security for Mobile Code
　* Security of Grid Computing
　* Security of eCommerce, eBusiness and eGovernment
　* Security Modeling and Architectures
　* Security Models for Ambient Intelligence environments
　* Trusted Computing
　* Usable Security