Lee's Stego Research Notes: December 2008

Thursday, December 18, 2008

News: Terrorist instructions hidden online

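On 2001/02/05, Jack Kelley published two back-to-back reports in USA Today. They state that, according to U.S. officials and experts, bin Laden and his organization may be passing attack instructions by hiding terrorist plans in pictures on pornographic Web sites or in sports chat rooms.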

Original Link

Terror groups hide behind Web encryption

02/05/2001 - Updated 05:17 PM ET
By Jack Kelley

WASHINGTON — Hidden in the X-rated pictures on several pornographic Web sites and the posted comments on sports chat rooms may lie the encrypted blueprints of the next terrorist attack against the United States or its allies. It sounds farfetched, but U.S. officials and experts say it's the latest method of communication being used by Osama bin Laden and his associates to outfox law enforcement. Bin Laden, indicted in the bombing in 1998 of two U.S. embassies in East Africa, and others are hiding maps and photographs of terrorist targets and posting instructions for terrorist activities on sports chat rooms, pornographic bulletin boards and other Web sites, U.S. and foreign officials say.

"Uncrackable encryption is allowing terrorists — Hamas, Hezbollah, al-Qaida and others — to communicate about their criminal intentions without fear of outside intrusion," FBI Director Louis Freeh said last March during closed-door testimony on terrorism before a Senate panel. "They're thwarting the efforts of law enforcement to detect, prevent and investigate illegal activities."

A terrorist's tool

Once the exclusive domain of the National Security Agency, the super-secret U.S. agency responsible for developing and cracking electronic codes, encryption has become the everyday tool of Muslim extremists in Afghanistan, Albania, Britain, Kashmir, Kosovo, the Philippines, Syria, the USA, the West Bank and Gaza and Yemen, U.S. officials say.

It's become so fundamental to the operations of these groups that bin Laden and other Muslim extremists are teaching it at their camps in Afghanistan and Sudan, they add.

"There is a tendency out there to envision a stereotypical Muslim fighter standing with an AK-47 in barren Afghanistan," says Ben Venzke, director of special intelligence projects for iDEFENSE, a cyberintelligence and risk management company based in Fairfax, Va.

"But Hamas, Hezbollah and bin Laden's groups have very sophisticated, well-educated people. Their technical equipment is good, and they have the bright, young minds to operate them," he said.

U.S. officials say bin Laden's organization, al-Qaida, uses money from Muslim sympathizers to purchase computers from stores or by mail. Bin Laden's followers download easy-to-use encryption programs from the Web, officials say, and have used the programs to help plan or carry out three of their most recent plots:

　　* Wadih El Hage, one of the suspects in the 1998 bombing of two U.S. embassies in East Africa, sent encrypted e-mails under various names, including "Norman" and "Abdus Sabbur," to "associates in al Qaida," according to the Oct. 25, 1998, U.S. indictment against him. Hage went on trial Monday in federal court in New York.

　　* Khalil Deek, an alleged terrorist arrested in Pakistan in 1999, used encrypted computer files to plot bombings in Jordan at the turn of the millennium, U.S. officials say. Authorities found Deek's computer at his Peshawar, Pakistan, home and flew it to the National Security Agency in Fort Meade, Md. Mathematicians, using supercomputers, decoded the files, enabling the FBI to foil the plot.

　　* Ramzi Yousef, the convicted mastermind of the World Trade Center bombing in 1993, used encrypted files to hide details of a plot to destroy 11 U.S. airliners. Philippines officials found the computer in Yousef's Manila apartment in 1995. U.S. officials broke the encryption and foiled the plot. Two of the files, FBI officials say, took more than a year to decrypt.

"All the Islamists and terrorist groups are now using the Internet to spread their messages," says Reuven Paz, academic director of the Institute for Counter-Terrorism, an independent Israeli think tank.

Messages in dots

U.S. officials and militant Muslim groups say terrorists began using encryption — which scrambles data and then hides the data in existing images — about five years ago.

But the groups recently increased its use after U.S. law enforcement authorities revealed they were tapping bin Laden's satellite telephone calls from his base in Afghanistan and tracking his activities.

"It's brilliant," says Ahmed Jabril, spokesman for the militant group Hezbollah in London. "Now it's possible to send a verse from the Koran, an appeal for charity and even a call for jihad and know it will not be seen by anyone hostile to our faith, like the Americans."

Extremist groups are not only using encryption to disguise their e-mails but their voices, too, Attorney General Janet Reno told a presidential panel on terrorism last year, headed by former CIA director John Deutsch. Encryption programs also can scramble telephone conversations when the phones are plugged into a computer.

"In the future, we may tap a conversation in which the terrorist discusses the location of a bomb soon to go off, but we will be unable to prevent the terrorist act when we cannot understand the conversation," Reno said.

Here's how it works: Each image, whether a picture or a map, is created by a series of dots. Inside the dots are a string of letters and numbers that computers read to create the image. A coded message or another image can be hidden in those letters and numbers.

They're hidden using free encryption Internet programs set up by privacy advocacy groups. The programs scramble the messages or pictures into existing images. The images can only be unlocked using a "private key," or code, selected by the recipient, experts add. Otherwise, they're impossible to see or read.

"You very well could have a photograph and image with the time and information of an attack sitting on your computer, and you would never know it," Venzke says. "It will look no different than a photograph exchanged between two friends or family members."

U.S. officials concede it's difficult to intercept, let alone find, encrypted messages and images on the Internet's estimated 28 billion images and 2 billion Web sites.

Even if they find it, the encrypted message or image is impossible to read without cracking the encryption's code. A senior Defense Department mathematician says cracking a code often requires lots of time and the use of a government supercomputer.

It's no wonder the FBI wants all encryption programs to file what amounts to a "master key" with a federal authority that would allow them, with a judge's permission, to decrypt a code in a case of national security. But civil liberties groups, which offer encryption programs on the Web to further privacy, have vowed to fight it.

Officials say the Internet has become the modern version of the "dead drop," a slang term describing the location where Cold War-era spies left maps, pictures and other information.

But unlike the "dead drop," the Internet, U.S. officials say, is proving to be a much more secure way to conduct clandestine warfare.

"Who ever thought that sending encrypted streams of data across the Internet could produce a map on the other end saying 'this is where your target is' or 'here's how to kill them'?" says Paul Beaver, spokesman for Jane's Defense Weekly in London, which reports on defense and cyberterrorism issues. "And who ever thought it could be done with near perfect security? The Internet has proven to be a boon for terrorists."

Original Link

Terrorist instructions hidden online

02/05/2001 - Updated 05:22 PM ET
By Jack Kelley

WASHINGTON — Osama bin Laden and other Muslim extremists are posting encrypted, or scrambled, photographs and messages on popular Web sites and using them to plan terrorist activities against the United States and its allies, U.S. officials say. The officials say bin Laden and his associates are using the Internet to conduct what some are calling "e-jihad," or holy war. Bin Laden, a dissident Saudi businessman, has been indicted for the 1998 bombing of two U.S. embassies in East Africa and is believed to be responsible for last fall's bombing of the USS Cole in Yemen. Four alleged bin Laden associates went on trial Monday in federal court in New York for the embassy bombings. "To a greater and greater degree, terrorist groups, including Hezbollah, Hamas, and bin Laden's al Qaida group, are using computerized files, e-mail, and encryption to support their operations," CIA Director George Tenet wrote last March to the Senate Foreign Relations Committee. The testimony, at a closed-door hearing, was later made public.

Through weeks of interviews with U.S. law-enforcement officials and experts, USA TODAY has learned new details of how extremists hide maps and photographs of terrorist targets — and post instructions for terrorist activities — on sports chat rooms, pornographic bulletin boards and other popular Web sites. Citing security concerns, officials declined to name the sites. Experts say it's difficult for law enforcement to intercept the messages.

"It's something the intelligence, law-enforcement and military communities are really struggling to deal with," says Ben Venzke, special projects director for iDEFENSE, a cyberintelligence company.

Officials and experts say the Internet is a new form of the "dead drop," a Cold War-era term for where spies left information. Officials and experts say the messages are scrambled using free encryption programs set up by groups that advocate privacy on the Internet. Those same programs also can hide maps and photographs in an existing image on selected Web sites. The e-mails and images can only be decrypted using a "private key" or code, selected by the recipient.

"The operational details and future targets, in many cases, are hidden in plain view on the Internet," Venzke says. "Only the members of the terrorist organizations, knowing the hidden signals, are able to extract the information."

Officials say bin Laden began using encryption five years ago, but recently increased its use after U.S. officials revealed they were tapping his satellite telephone calls in Afghanistan and tracking his activities.

"We will use whatever tools we can — e-mails, the Internet — to facilitate jihad against the (Israeli) occupiers and their supporters," Sheik Ahmed Yassin, the founder of the militant Muslim group Hamas said in a recent interview in the Gaza Strip. "We have the best minds working with us."

Sunday, December 14, 2008

Defending Against Statistical Steganalysis (part 3)

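This post continues the discussion from part 2 and focuses on Section 7, Analysis. A complete paper has to examine its proposed method from every angle to show that it holds up under scrutiny.

The author first states where the test images came from: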

To evaluate our correction algorithm, we embedded data into 54 pictures taken with a Fuji MX-1700 digital camera around Ann Arbor, Michigan. The size of the images is 640*480 pixels.

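After downloading the photos from the digital camera to a computer, the author recompressed them with a quality setting of 75 to stand in for images that contain no hidden message.

In this image set, the number of DCT coefficients usable for embedding averages 46,000, with a minimum of 30,000 and a maximum of 97,000. Each DCT coefficient provides one redundant bit. Before any message is embedded, the author observes that these redundant bits are biased toward 1 (a strong correlation toward one):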

On average 63.8% of all the bits are set with a standard deviation of 3.4% between images.
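The one-correlation quoted above is simply the share of redundant bits that equal 1. A minimal sketch of that measurement follows. The function names and the sample coefficients are made up for illustration, and skipping coefficients equal to 0 or 1 is an assumption about which coefficients count as usable.

```python
def redundant_bits(coefficients):
    """Return the least-significant bit of every usable DCT coefficient.

    Assumption: coefficients equal to 0 or 1 are not used for embedding,
    so they contribute no redundant bit.
    """
    return [c & 1 for c in coefficients if c not in (0, 1)]


def one_correlation(bits):
    """Fraction of bits that are set to 1 (0.0 for an empty list)."""
    return sum(bits) / len(bits) if bits else 0.0


# Hypothetical quantized DCT coefficients, not taken from a real image.
coeffs = [3, -1, 0, 5, 2, 1, -3, 7, 4, -5]
bits = redundant_bits(coeffs)
ratio = one_correlation(bits)
print(f"{ratio:.2%} of the redundant bits are set")
```

Python's `&` treats negative integers as two's complement, so -1, -3 and -5 all have a least-significant bit of 1, and `x ^ 1` pairs -2 with -1 and 2 with 3.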

Next, the author compressed the first chapter of "The Hunting of the Snark" by Lewis Carroll (the author of Alice's Adventures in Wonderland), producing a hidden message of 14,700 bits.

Figure 7: Comparison between unmodified images, images with data embedded but without statistical corrections, and finally images with data embedded plus statistical corrections.

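Figure 7 shows the results of simple statistics computed over the redundant bits only. For a truly random source of 8-bit blocks, the Maurer test yields 7.184. Without the correcting transform (No corrections), the entropy rises from 6.732±0.233 to 6.976±0.168, and the one-correlation drops from 63.41%±3.50% to 59.10%±3.19%. With corrections, however, both the one-correlation and the Maurer test results are very close to those of the unmodified images. When the experiment is restricted to the images covered by the a priori estimate, the author notes that the corrected results, for both the one-correlation and the Maurer test, come even closer to those of the original images without an embedded message. Simple statistical tests of this kind therefore cannot detect the hidden message.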
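To make the 7.184 reference value concrete, here is a minimal sketch of Maurer's universal statistic for 8-bit blocks. It is not the code used in the paper. The function name, the choice of 2,560 initialization blocks and the seeded pseudo-random test data are assumptions for illustration.

```python
import math
import random


def maurer_statistic(data, init_blocks=2560):
    """Maurer's universal statistic for a byte sequence (block length 8).

    The first `init_blocks` bytes only fill the table of last-seen
    positions. Every later byte contributes log2 of the distance to the
    previous occurrence of the same byte value, or log2 of its own
    position if the value has not occurred before.
    """
    if len(data) <= init_blocks:
        raise ValueError("need more than init_blocks bytes")
    last_seen = [0] * 256  # positions are 1-based; 0 means "not seen yet"
    for n in range(1, init_blocks + 1):
        last_seen[data[n - 1]] = n
    total = 0.0
    for n in range(init_blocks + 1, len(data) + 1):
        block = data[n - 1]
        total += math.log2(n - last_seen[block])
        last_seen[block] = n
    return total / (len(data) - init_blocks)


# Pseudo-random bytes should land near the 7.184 quoted for a random source.
rng = random.Random(0)
random_bytes = bytes(rng.getrandbits(8) for _ in range(200_000))
random_value = maurer_statistic(random_bytes)

# A constant stream repeats at distance 1 every time, so the statistic is 0.
constant_value = maurer_statistic(bytes(5000))
print(round(random_value, 3), constant_value)
```

Redundant bits that are less random than an encrypted message score below the random-source value, which is why embedding without corrections pushes the figure upward.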

Figure 8: The extended Chi-square test detects the embedding for image that has no statistical corrections. Our naive correction is even more detectable. However, an image that receives the proper statistical correction can not be distinguished from an unmodified image.

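Next, the author turns to the DCT frequency histogram. The author points out that, with corrections applied, a plotted histogram shows no visible difference from that of the original image. Figure 8 shows the results of the extended Chi-square test: when Algorithm 1 is used for correction, the probability of embedding that the test computes is close to 0 for every part of the image.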

Back to: Defending Against Statistical Steganalysis (part 1)
Back to: Defending Against Statistical Steganalysis (part 2)

Niels Provos, "Defending Against Statistical Steganalysis," 10th USENIX Security Symposium, August 13-17, 2001.

Thursday, December 04, 2008

Defending Against Statistical Steganalysis (part 2)

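This post continues the discussion from part 1 and focuses on Section 6, Correcting Statistical Deviations. It examines how OutGuess 0.2 preserves the histogram characteristics of the cover image in the stego image.

From part 1, we know that OutGuess 0.2 does not use all of the redundant bits for embedding. The selection process allows at most half of the redundant bits to carry the hidden message.

One point deserves special attention in steganography research: we usually do not know which statistical tests an adversary will use to detect whether a message has been embedded in a medium. If we do know of specific attacks, we can guard against them in advance by spending some of the redundant bits on correcting the statistical deviation that embedding introduces.

The author first considers the one-correlation and the Maurer test. In Section 1, Introduction, the author notes: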

For example, ones and zeros are equally likely in a message that has been encrypted. However, the redundant data being replaced might have a strong correlation towards either zero or one. Embedding the encrypted message weakens that correlation.

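The Maurer test is a tool for checking whether generated binary random data meets the standard of good randomness (Ueli M. Maurer, "A Universal Statistical Test for Random Bit Generators," Journal of Cryptology, Vol. 5, pp. 89-105, 1992). Basically, whenever a redundant bit is changed from 0 to 1, the author tries to find another redundant bit nearby and change it from 1 to 0. The advantage is that the entropy of the redundant bits does not increase, so the message cannot be detected by this kind of statistic. For JPEG images, however, this approach distorts the DCT coefficient histogram: in Figure 5, the frequencies of DCT coefficients -2 and -1 are closer to each other than in Figure 3, and those of coefficients 2 and 3 are almost identical.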

Figure 5: The naive statistical corrections cause the frequency of adjacent DCT coefficients to be equalized. It is immediately evident that the image is modified.

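To avoid distorting the DCT histogram, additional corrections are needed to maintain the distribution of the DCT coefficients. For example, suppose a message bit is embedded into the j-th DCT coefficient, whose value is DCT(j). If DCT(j) = 2i, the coefficient is changed to 2i+1. To correct the distortion this causes, we look for a nearby DCT coefficient whose value is exactly 2i+1 and change it to 2i. Let k be the position found; DCT(k) then changes from 2i+1 to 2i. If this correction is carried out immediately after every embedding step that changes a value, the DCT histogram of the resulting stego image is identical to that of the original cover image.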
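The swap described above can be sketched in a few lines. This is only an illustration of the idea, not OutGuess code. The function name, the `locked` parameter and the sample coefficients are assumptions, and searching outward in both directions is one possible reading of "nearby".

```python
from collections import Counter


def embed_with_correction(coeffs, j, locked=()):
    """Flip the LSB of coeffs[j], then undo the histogram change nearby.

    Returns the index k used for the correction, or None if no suitable
    coefficient exists. `locked` holds indices that must not be touched,
    for example positions that already carry message bits.
    """
    old = coeffs[j]
    new = old ^ 1  # 2i <-> 2i+1
    coeffs[j] = new
    # Search outward from j for a coefficient equal to `new` and set it to `old`.
    for distance in range(1, len(coeffs)):
        for k in (j - distance, j + distance):
            if 0 <= k < len(coeffs) and k not in locked and coeffs[k] == new:
                coeffs[k] = old
                return k
    return None


cover = [4, 5, 2, 3, 5, 4, 2]  # hypothetical coefficient values
stego = list(cover)
k = embed_with_correction(stego, 0)  # DCT(0): 4 -> 5, so some 5 must become 4
print(stego, "corrected at index", k)
print(Counter(stego) == Counter(cover))
```

Because every flip is matched by the opposite flip elsewhere, the frequency counts of the two values are the same before and after.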

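Moreover, the correcting transform essentially keeps the frequency counts of all DCT coefficients unchanged. Statistics based on frequency counts therefore cannot detect any difference between the original medium and the stego medium.

Let f be a frequency count in the histogram and g its adjacent frequency count; without loss of generality, let f > g. Let α denote the fraction of redundant bits used for embedding. After embedding, the frequency counts become: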

f* = f - α ( f - g ) / 2 ,
g* = g + α ( f - g ) / 2 .

When the embedding rate α is 0%, the gap between f and g is ( f - g ). As α grows, f* and g* move closer together, and when α reaches 100%:

f* = f - ( f - g ) / 2 = ( f + g ) / 2 ,
g* = g + ( f - g ) / 2 = ( g + f ) / 2 .

In other words, when all redundant bits are used for embedding, f* = g* = ( g + f ) / 2.
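The formulas follow from a short counting argument, assuming the message bits are uniformly random: a fraction α of the coefficients in each bin is used, and each used coefficient flips with probability 1/2, so about α f / 2 coefficients move from f to g and α g / 2 move back. The sketch below evaluates the result for a few values of α. The function name and the counts 1000 and 600 are illustrative only.

```python
def counts_after_embedding(f, g, alpha):
    """Expected adjacent frequency counts after embedding a fraction alpha.

    The net flow between the two bins is alpha * (f - g) / 2.
    """
    shift = alpha * (f - g) / 2
    return f - shift, g + shift


for alpha in (0.0, 0.5, 1.0):
    f_star, g_star = counts_after_embedding(1000, 600, alpha)
    print(f"alpha={alpha:.1f}: f*={f_star:.0f}, g*={g_star:.0f}")
```

The sum f* + g* stays at f + g for every α; only the gap between the two bins shrinks.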

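To keep the DCT histogram unchanged, OutGuess must reserve enough redundant bits for the correcting transform after the message has been embedded, so the following inequality must hold: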

(1- α ) g ≥ α ( f - g) /2

This inequality simplifies to an a priori estimate for α, which serves as a guide when embedding:

α ≤ 2 g / (f + g)
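The simplification takes two steps: multiplying (1 - α) g ≥ α ( f - g ) / 2 by 2 and moving the α terms to one side gives 2 g ≥ α ( f + g ), hence α ≤ 2 g / ( f + g ). The sketch below computes the bound for one pair of adjacent counts and checks it against the original inequality. The function names and the sample counts are assumptions for illustration; how the paper combines the bound across all pairs of an image is not covered here.

```python
def max_embedding_fraction(f, g):
    """Upper bound on alpha for one pair of adjacent frequency counts."""
    if f < g:
        f, g = g, f  # the derivation assumes f >= g
    if f + g == 0:
        return 0.0
    return 2 * g / (f + g)


def correction_is_possible(f, g, alpha):
    """Check the original inequality (1 - alpha) * g >= alpha * (f - g) / 2."""
    if f < g:
        f, g = g, f
    return (1 - alpha) * g >= alpha * (f - g) / 2


bound = max_embedding_fraction(1000, 600)
print(bound)
print(correction_is_possible(1000, 600, bound))
print(correction_is_possible(1000, 600, 0.8))
```

When the two counts are equal the bound is 1, because embedding does not change the expected counts and nothing needs correcting.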

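Given a hidden message, we can use this estimate to choose a cover image with enough redundant bits to preserve the original frequency counts. Interestingly, an image with more DCT coefficients does not necessarily offer more redundant bits for embedding, as Figure 6 shows. Each circle in Figure 6 represents one image; the horizontal axis gives its total number of DCT coefficients, and the vertical axis gives the fraction usable for embedding, as computed by the formula above.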

Figure 6: The fraction of the DCT coefficients that can be used for data hiding does not increase linearly for images with more coefficients.

The author wants the proposed correcting transform to meet the following two requirements:

1. For any part of the image, the distribution of the DCT coefficients should be similar to the unmodified image.

2. The number of corrections necessary to preserve statistical properties should be small.

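On closer inspection, the two requirements cannot both be optimal at once. Minimizing the number of corrections (requirement 2) means analyzing the embedding as a whole and then correcting the overall error, in which case the DCT distribution of some parts of the image may not match the original. Conversely, to make the DCT distribution of every part of the image match the original, the distortion caused by embedding has to be corrected immediately within that part, and then the number of corrections cannot be minimal. As a compromise, the author defines a threshold for each frequency that specifies the error tolerated for that frequency count.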

Algorithm 1: This transform preserves the statistical properties of a JPEG image. It keeps track of differences in the frequency counts between original and stego medium. If the differences exceed a certain threshold, the frequency count is adjusted.

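The author claims that Algorithm 1 satisfies both requirements. Algorithm 1 runs only after the entire message has been embedded. In the paper, Algorithm 1 is annotated with five labels; the key steps are explained below:

Label 1: Compute the DCT frequency histogram of the original cover image and store it in array N.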

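Label 2: Determine the tolerated error for each frequency and store it in array N*.
　　　　It is computed as follows (this α is the threshold scale of Algorithm 1, not the embedding fraction used earlier):
　　　　k ← number of coefficients in image;
　　　　α ← 0.03 * 5000 / k ;
　　　　N*(i) = α * N(i);
　　　　The array Nerror records the error accumulated so far for each DCT coefficient value.
　　　　Nerror (i) ← 0;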

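Label 3: AdjDCT ← DCT(i) ⊕ 1;

Algorithm 1 processes every embeddable position, starting from the beginning of the image. When it finds a DCT coefficient that has been modified, it does not immediately try to correct it; it first checks whether the modification exceeds the tolerated level. It begins by checking whether the paired coefficient AdjDCT (label 3) also has a recorded error. If it does (label 4), the two errors cancel and no extra correction is needed, which serves requirement 2: as few corrections as possible. If AdjDCT has no recorded error and the coefficient's own error has not yet exceeded the tolerance (label 5), no correction is made for now and the error is only accumulated: increment Nerror[DCT(i)]. If the error has already exceeded the tolerance N*[DCT(i)], requirement 1 demands an immediate correction: a matching coefficient is found nearby and changed, that is, Algorithm 2 is run as exchDCT(i, DCT(i)). If no matching coefficient is found, the error is recorded and left for the final stage, where a wide-range correction (searching backwards from the end of the image, k) is performed.

Readers who want a better understanding of what the threshold frequencies do in Algorithm 1 can try setting α to 0 and observing the execution, then setting α to 1 and observing how it changes. This should make the role of α clearer.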

Algorithm 2: Find a specific DCT coefficient and change it to its adjacent DCT coefficient.

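Algorithm 1 calls the function exchDCT() to perform the actual correction in two situations:

1. When the accumulated error Nerror exceeds the tolerated range, the correction is performed immediately so that the accumulated error does not grow too large and violate requirement 1.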

If the number of errors for the coefficient, Nerror[DCT(i)], can be incremented without exceeding its threshold value. If another increment is possible, we continue with the next modification. Otherwise, we have to correct the current modification in the image. The exchDCT algorithm is responsible for that.

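2. In the final stage, a wide-range search is carried out to correct all remaining errors. Not every error can be corrected, but as long as most of them are, the embedding will not be detected.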

After all modifications have been examined, we need to correct all remaining errors. Not all the corrections might be possible. However, if we are able to correct most of the errors, changes in the histogram are not detectable.
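The two algorithms described above can be sketched as follows. This is a reconstruction from the description in this post, not the paper's pseudocode or the OutGuess source. The names `correct_statistics`, `exch_dct`, `locked` and `message_positions`, the option to override α, and the sample data are all assumptions for illustration.

```python
from collections import Counter


def exch_dct(stego, start, value, locked):
    """Algorithm 2 (sketch): search backwards from `start` for a free
    coefficient equal to `value` and flip it to its adjacent value."""
    for j in range(min(start, len(stego)) - 1, -1, -1):
        if j not in locked and stego[j] == value:
            stego[j] ^= 1
            locked.add(j)  # never touch a corrected position again
            return True
    return False


def correct_statistics(cover, stego, message_positions, alpha=None):
    """Algorithm 1 (sketch): repair the histogram of `stego` in place.

    Returns the per-value errors that could not be corrected.
    """
    k = len(stego)
    if alpha is None:
        alpha = 0.03 * 5000 / k
    hist = Counter(cover)                                 # label 1
    threshold = {v: alpha * n for v, n in hist.items()}   # label 2
    n_error = Counter()
    locked = set(message_positions)
    for i in range(k):
        if stego[i] == cover[i]:
            continue                                      # unmodified
        value = stego[i]
        adj = value ^ 1                                   # label 3
        if n_error[adj] > 0:                              # label 4: errors cancel
            n_error[adj] -= 1
            continue
        if n_error[value] < threshold.get(value, 0):      # label 5: tolerate
            n_error[value] += 1
            continue
        if not exch_dct(stego, i, value, locked):         # correct right away
            n_error[value] += 1
    # Final stage: correct remaining errors, searching from the image end.
    for value in list(n_error):
        while n_error[value] > 0 and exch_dct(stego, k, value, locked):
            n_error[value] -= 1
    return +n_error  # keep only values with a remaining positive error


cover = [2, 3, 2, 3, 2, 3, 4, 5, 4, 5, 2, 3]  # hypothetical coefficients
message_positions = {5, 8, 11}


def run(alpha):
    stego = list(cover)
    for p in message_positions:
        stego[p] ^= 1  # pretend every message bit forced a flip
    leftover = correct_statistics(cover, stego, message_positions, alpha)
    return stego, leftover


stego_eager, left_eager = run(alpha=0)  # every error is corrected at once
stego_lazy, left_lazy = run(alpha=1)    # errors wait for the final stage
print(stego_eager, stego_lazy)
```

With α = 0 each correction lands just before the modified coefficient, while with α = 1 the corrections are found by the backward search from the end of the image. Both runs restore the cover histogram in this small example, which mirrors the trade-off between the two requirements.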

　
Back to: Defending Against Statistical Steganalysis (part 1)
Go to: Defending Against Statistical Steganalysis (part 3)

Niels Provos, "Defending Against Statistical Steganalysis," 10th USENIX Security Symposium, August 13-17, 2001.