fgetcsv

(PHP 4, PHP 5, PHP 7, PHP 8)

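fgetcsv — Gets a line from a file pointer and parses it for CSV fields

Description

fgetcsv(
    resource $stream,
    ?int $length = null,
    string $separator = ",",
    string $enclosure = "\"",
    string $escape = "\\"
): array|false

Similar to fgets(), except that fgetcsv() parses the line it reads for fields in CSV (comma-separated values) format and returns an array containing the fields read.

Note: The locale settings are taken into account by this function. For example, data encoded in certain one-byte encodings may be parsed wrongly if LC_CTYPE is, e.g., en_US.UTF-8.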

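Parameters

stream

A valid file pointer to a file successfully opened by fopen(), popen(), or fsockopen().

length

Must be greater than the longest line (in characters) to be found in the CSV file, allowing for trailing line-end characters. Otherwise the line is split into chunks of length characters, unless the split would occur inside an enclosure.

Omitting this parameter (or setting it to 0, or to null in PHP 8.0.0 or later) removes the limit on the maximum line length, which is slightly slower.

separator

The separator parameter sets the field separator. It must be a single-byte character.

enclosure

The enclosure parameter sets the field enclosure character. It must be a single-byte character.

escape

The escape parameter sets the escape character. It must be a single-byte character or the empty string. The empty string ("") disables the proprietary escape mechanism.

Note: Usually an enclosure character is escaped inside a field by doubling it; however, the escape character can be used as an alternative. So for the default parameter values "" and \" have the same meaning. Other than allowing the enclosure character to be escaped, the escape character has no special meaning; it isn't even meant to escape itself.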
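
To make the note on escaping concrete, here is a minimal sketch, assuming PHP 7.4 or later and an in-memory stream. The helper name parseCsvLine is hypothetical and not part of PHP. It illustrates that a doubled enclosure is collapsed to a single one, while a backslash-escaped enclosure stays in the field together with its backslash.

```php
<?php
// Hypothetical helper (not part of PHP): parse one CSV line from an
// in-memory stream with an explicitly chosen escape character.
function parseCsvLine(string $line, string $escape): array
{
    $stream = fopen('php://memory', 'w+');
    fwrite($stream, $line);
    rewind($stream);
    $fields = fgetcsv($stream, 0, ',', '"', $escape);
    fclose($stream);
    return $fields;
}

// Doubled enclosure: the pair "" becomes a single " in the result.
$doubled = parseCsvLine('"foo""bar",x' . "\n", '\\');

// Escaped enclosure: the field does not end at \", but the backslash
// is not removed from the returned value.
$escaped = parseCsvLine('"foo\\"bar",x' . "\n", '\\');

echo $doubled[0], "\n"; // foo"bar
echo $escaped[0], "\n"; // foo\"bar
```

If the backslash is not wanted in the result, it has to be removed by the calling code.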

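Warning

As of PHP 8.4.0, depending on the default value of escape is deprecated. It needs to be provided explicitly, either positionally or by the use of named arguments.

Warning

When escape is set to anything other than an empty string (""), it can result in CSV that is not compliant with RFC 4180 or that cannot survive a roundtrip through the PHP CSV functions. The default for escape is "\\", so it is recommended to set it to the empty string explicitly. The default value will change in a future version of PHP, no earlier than PHP 9.0.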
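
As an illustration of the recommendation above, the following sketch passes an empty escape explicitly by using named arguments, which assumes PHP 8.0 or later. The sample data and variable names are made up for this example.

```php
<?php
// Sample data: the second row contains a backslash and doubled quotes.
$csv = 'id,comment' . "\n" . '1,"back\slash and ""quotes"""' . "\n";

$stream = fopen('php://memory', 'w+');
fwrite($stream, $csv);
rewind($stream);

$rows = [];
// escape: '' disables the proprietary escape mechanism, so the
// backslash is treated as an ordinary character.
while (($row = fgetcsv($stream, separator: ',', enclosure: '"', escape: '')) !== false) {
    $rows[] = $row;
}
fclose($stream);

echo $rows[1][1], "\n"; // back\slash and "quotes"
```

On PHP 7.4 the same call can be written positionally as fgetcsv($stream, 0, ',', '"', '').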

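Return Values

Returns an indexed array containing the fields read on success, or false on failure.

Note:
A blank line in a CSV file will be returned as an array comprising a single null field, and will not be treated as an error.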
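
Because a blank line comes back as an array holding a single null, callers that do not want such rows have to filter them out themselves. A minimal sketch, assuming PHP 7.4 or later; the helper name readNonBlankRows is hypothetical:

```php
<?php
// Hypothetical helper: read every row of an open CSV stream and skip
// blank lines, which fgetcsv() reports as an array holding one null.
function readNonBlankRows($stream): array
{
    $rows = [];
    while (($row = fgetcsv($stream, 0, ',', '"', '')) !== false) {
        if ($row === [null]) {
            continue; // blank line, not an error
        }
        $rows[] = $row;
    }
    return $rows;
}

$stream = fopen('php://memory', 'w+');
fwrite($stream, "a,b\n\nc,d\n");
rewind($stream);
$rows = readNonBlankRows($stream);
fclose($stream);

echo count($rows), "\n"; // 2
```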

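Note: If PHP is not properly recognizing the line endings when reading files either on or created by a Macintosh computer, enabling the auto_detect_line_endings run-time configuration option may help resolve the problem.

Errors/Exceptions

Throws a ValueError if separator or enclosure is not one byte long.

Throws a ValueError if escape is not one byte long or the empty string.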
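
The sketch below shows one way to catch this error, assuming PHP 8.0 or later, where invalid arguments raise a ValueError. The helper name separatorIsAccepted is hypothetical.

```php
<?php
// Hypothetical helper: report whether fgetcsv() accepts a separator.
function separatorIsAccepted(string $separator): bool
{
    $stream = fopen('php://memory', 'w+');
    fwrite($stream, "a,b\n");
    rewind($stream);
    try {
        fgetcsv($stream, 0, $separator, '"', '');
        return true;
    } catch (ValueError $e) {
        // Raised when the separator is not exactly one byte long.
        return false;
    } finally {
        fclose($stream);
    }
}

var_dump(separatorIsAccepted(';'));  // bool(true)
var_dump(separatorIsAccepted('::')); // bool(false)
var_dump(separatorIsAccepted('；')); // bool(false): full-width semicolon is multi-byte
```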

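Changelog

Version	Description
8.4.0	Relying on the default value of `escape` is now deprecated.
8.3.0	A blank string is now returned instead of a string with a single null byte if the last field contains only an unterminated enclosure.
8.0.0	`length` is now nullable.
7.4.0	The `escape` parameter now also accepts an empty string to disable the proprietary escape mechanism.

Examples

Example #1 Read and print the entire contents of a CSV file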

<?php
$row = 1;
if (($handle = fopen("test.csv", "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $num = count($data);
        echo "<p> $num fields in line $row: <br /></p>\n";
        $row++;
        for ($c = 0; $c < $num; $c++) {
            echo $data[$c] . "<br />\n";
        }
    }
    fclose($handle);
}
?>

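See Also

fputcsv() - Format line as CSV and write to file pointer
str_getcsv() - Parse a CSV string into an array
SplFileObject::fgetcsv() - Gets line from file and parse as CSV fields
SplFileObject::fputcsv() - Write a field array as a CSV line
SplFileObject::setCsvControl() - Set the delimiter, enclosure and escape character for CSV
SplFileObject::getCsvControl() - Get the delimiter, enclosure and escape character for CSV
explode() - Split a string by a string
file() - Reads entire file into an array
pack() - Pack data into binary string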


User Contributed Notes 31 notes

james dot ellis at gmail dot com ¶

16 years ago

If you need to set auto_detect_line_endings to deal with Mac line endings, it may seem obvious, but remember that it should be set before fopen, not after.

This will work:
<?php
ini_set('auto_detect_line_endings', TRUE);
$handle = fopen('/path/to/file', 'r');
while (($data = fgetcsv($handle)) !== FALSE) {
    // process the row
}
ini_set('auto_detect_line_endings', FALSE);
?>

This won't; you will still get concatenated fields at the new line position:
<?php
$handle = fopen('/path/to/file', 'r');
ini_set('auto_detect_line_endings', TRUE);
while (($data = fgetcsv($handle)) !== FALSE) {
    // process the row
}
ini_set('auto_detect_line_endings', FALSE);
?>

shaun at slickdesign dot com dot au ¶

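6 years ago

When a BOM character is supplied, `fgetcsv` may appear to wrap the first element in "double quotation marks". The simplest way to ignore it is to advance the file pointer to the 4th byte before using `fgetcsv`.

<?php
// BOM as a string for comparison.
$bom = "\xef\xbb\xbf";

// Read the file from the beginning.
$fp = fopen($path, 'r');

// Advance the file pointer and get the first 3 characters to compare to the BOM string.
if (fgets($fp, 4) !== $bom) {
    // BOM not found - rewind the pointer to the start of the file.
    rewind($fp);
}

// Read the CSV into an array.
$lines = array();
while (!feof($fp) && ($line = fgetcsv($fp)) !== false) {
    $lines[] = $line;
}
?>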

michael dot arnauts at gmail dot com ¶

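12 years ago

fgetcsv seems to handle newlines within fields fine. So in fact it is not reading a line, but keeps reading until it finds a \n character that is not quoted as part of a field.

Example:

<?php
/* test.csv contents: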
"col 1","col2","col3"
"this
is
having
multiple
lines","this not","this also not"
"normal record","nothing to see here","no data"
*/

$handle = fopen("test.csv", "r");
while (($data = fgetcsv($handle)) !== FALSE) {
 var_dump($data);
}
?>

Returns:
array(3) {
  [0]=>
string(5) "col 1"
  [1]=>
string(4) "col2"
  [2]=>
string(4) "col3"
}
array(3) {
  [0]=>
string(29) "this
is
having
multiple
lines"
  [1]=>
string(8) "this not"
  [2]=>
string(13) "this also not"
}
array(3) {
  [0]=>
string(13) "normal record"
  [1]=>
string(19) "nothing to see here"
  [2]=>
string(7) "no data"
}

This means that you can expect fgetcsv to handle newlines within fields fine. This was not clear from the documentation.

Gandalf the White ¶

7 years ago

Forget about the fuss of this while() loop! Use this instead:

$rows = array_map('str_getcsv', file('myfile.csv'));
$header = array_shift($rows);
$csv = array();
foreach ($rows as $row) {
    $csv[] = array_combine($header, $row);
}

Source: https://steindom.com/articles/shortest-php-code-convert-csv-associative-array

myrddin at myrddin dot myrddin ¶

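18 years ago

Here is an OOP-based importer similar to the one posted earlier. However, this one is more flexible in that you can import huge files without running out of memory; you just have to use a limit on the get() method.

Sample usage for small files: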
-------------------------------------

<?php 
$importer = new CsvImporter("small.txt",true); 
$data = $importer->get(); 
print_r($data); 
?> 



Sample usage for large files:
-------------------------------------

<?php 
$importer = new CsvImporter("large.txt",true); 
while($data = $importer->get(2000)) 
{ 
print_r($data); 
} 
?> 



And here is the class:
-------------------------------------

<?php 
class CsvImporter 
{ 
 private $fp; 
 private $parse_header; 
 private $header; 
 private $delimiter; 
 private $length; 
 //-------------------------------------------------------------------- 
 function __construct($file_name, $parse_header=false, $delimiter="\t", $length=8000) 
 { 
 $this->fp = fopen($file_name, "r"); 
 $this->parse_header = $parse_header; 
 $this->delimiter = $delimiter; 
 $this->length = $length; 
 
 if ($this->parse_header) 
 { 
 $this->header = fgetcsv($this->fp, $this->length, $this->delimiter); 
 } 
 
 } 
 //-------------------------------------------------------------------- 
 function __destruct() 
 { 
 if ($this->fp) 
 { 
 fclose($this->fp); 
 } 
 } 
 //-------------------------------------------------------------------- 
 function get($max_lines=0) 
 { 
 //if $max_lines is set to 0, then get all the data 
 
 $data = array(); 
 
 if ($max_lines > 0) 
 $line_count = 0; 
 else 
 $line_count = -1; // so loop limit is ignored 
 
 while ($line_count < $max_lines && ($row = fgetcsv($this->fp, $this->length, $this->delimiter)) !== FALSE) 
 { 
 if ($this->parse_header) 
 { 
 foreach ($this->header as $i => $heading_i) 
 { 
 $row_new[$heading_i] = $row[$i]; 
 } 
 $data[] = $row_new; 
 } 
 else 
 { 
 $data[] = $row; 
 } 
 
 if ($max_lines > 0) 
 $line_count++; 
 } 
 return $data; 
 } 
 //-------------------------------------------------------------------- 
 
} 
?>

chris at ocproducts dot com ¶

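7 years ago

This function has no special BOM (byte order mark) handling. The first cell of the first row will inherit the BOM bytes, i.e. it will be 3 bytes longer than expected. As the BOM is invisible, you may not notice.

Excel on Windows, or text editors like Notepad, may add the BOM.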

jc at goetc dot net ¶

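20 years ago

I've had a lot of projects recently dealing with CSV files, so I created the following class to read a CSV file and return an array of arrays with the column names as keys. The only requirement is that the first row contains the column headings.

I only wrote it today, so I'll probably expand on it in the near future.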


<?php 
class CSVparse 
 { 
 var $mappings = array(); 
 
 function parse_file($filename) 
 { 
 $id = fopen($filename, "r"); //open the file 
 $data = fgetcsv($id, filesize($filename)); /* get the */ 
 /* main column names */ 
 
 if( !$this->mappings ) 
 $this->mappings = $data; 
 
 while($data = fgetcsv($id, filesize($filename))) 
 { 
 if($data[0]) 
 { 
 foreach($data as $key => $value) 
 $converted_data[$this->mappings[$key]] = addslashes($value); 
 $table[] = $converted_data; /* put each line into */ 
 } /* the $table array */ 
 } /* as its own entry */ 
 fclose($id); //close the file 
 return $table; 
 } 
 } 
?>

tomasz at marcinkowski dot pl ¶

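11 years ago

For anyone else struggling with disappearing non-Latin characters in one-byte encodings: setting the LANG environment variable (as the manual states) does not help at all. Look at LC_ALL instead.

In my case it was set to "pl_PL.utf8", but since my input file was in CP1250, most Polish characters (but not all!) were gone and the city of "Łódź" became "dź". I "fixed" it with "pl_PL".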

phpnet at smallfryhosting dot co dot uk ¶

21 years ago

Another version [modified from michael from mediaconcepts]

<?php
# Note: the default enclosure is '"' here because an empty enclosure is rejected by current PHP versions.
function arrayFromCSV($file, $hasFieldNames = false, $delimiter = ',', $enclosure = '"') {
    $result = array();
    $size = filesize($file) + 1;
    $file = fopen($file, 'r');
    # TODO: there must be a better way of finding out the size of the longest row... until then
    if ($hasFieldNames) $keys = fgetcsv($file, $size, $delimiter, $enclosure);
    while ($row = fgetcsv($file, $size, $delimiter, $enclosure)) {
        $n = count($row); $res = array();
        for ($i = 0; $i < $n; $i++) {
            $idx = ($hasFieldNames) ? $keys[$i] : $i;
            $res[$idx] = $row[$i];
        }
        $result[] = $res;
    }
    fclose($file);
    return $result;
}
?>

michael dot martinek at gmail dot com ¶

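16 years ago

Here's something I put together this morning. It allows you to read rows from your CSV and get values based on the name of the column. This works great when your header columns are not always in the same order, for example when you're processing many feeds from different customers. It also makes for cleaner, easier-to-manage code.

So if your feed looks like this: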


product_id,category_name,price,brand_name,sku_isbn_upc,image_url,landing_url,title,description
123,Test Category,12.50,No Brand,0,http://www.example.com, http://www.example.com/landing.php, Some Title,Some Description


You can do:
<?php 
while ($o->getNext()) 
{ 
 $dPrice = $o->getPrice(); 
 $nProductID = $o->getProductID(); 
 $sBrandName = $o->getBrandName(); 
} 
?> 

If you have any questions or comments regarding this class, they can be directed to michael.martinek@gmail.com, as I probably won't be checking back here.


<?php 
 define('C_PPCSV_HEADER_RAW', 0); 
 define('C_PPCSV_HEADER_NICE', 1); 
 
 class PaperPear_CSVParser 
 { 
 private $m_saHeader = array(); 
 private $m_sFileName = ''; 
 private $m_fp = false; 
 private $m_naHeaderMap = array(); 
 private $m_saValues = array(); 
 
 function __construct($sFileName) 
 { 
 //quick and dirty opening and processing.. you may wish to clean this up 
 if ($this->m_fp = fopen($sFileName, 'r')) 
 { 
 $this->processHeader(); 
 } 
 } 
 
 function __call($sMethodName, $saArgs) 
 { 
 //check to see if this is a set() or get() request, and extract the name 
 if (preg_match("/[sg]et(.*)/", $sMethodName, $saFound)) 
 { 
 //convert the name portion of the [gs]et to uppercase for header checking 
 $sName = strtoupper($saFound[1]); 
 
 //see if the entry exists in our named header-> index mapping 
 if (array_key_exists($sName, $this->m_naHeaderMap)) 
 { 
 //it does.. so consult the header map for which index this header controls 
 $nIndex = $this->m_naHeaderMap[$sName]; 
 if ($sMethodName[0] == 'g') 
 { 
 //return the value stored in the index associated with this name 
 return $this->m_saValues[$nIndex]; 
 } 
 else 
 { 
 //set the value 
 $this->m_saValues[$nIndex] = $saArgs[0]; 
 return true; 
 } 
 } 
 } 
 
 //nothing we control so bail out with a false 
 return false; 
 } 
 
 //get a nicely formatted header name. This will take product_id and make 
 //it PRODUCTID in the header map. So now you won't need to worry about whether you need 
 //to do a getProductID, or getproductid, or getProductId.. all will work. 
 public static function GetNiceHeaderName($sName) 
 { 
 return strtoupper(preg_replace('/[^A-Za-z0-9]/', '', $sName)); 
 } 
 
 //process the header entry so we can map our named header fields to a numerical index, which 
 //we'll use when we use fgetcsv(). 
 private function processHeader() 
 { 
 $sLine = fgets($this->m_fp); 
 //you'll want to make this configurable 
 $saFields = explode(",", $sLine); 
 
 $nIndex = 0; 
 foreach ($saFields as $sField) 
 { 
 //get the nice name to use for "get" and "set". 
 $sField = trim($sField); 
 
 $sNiceName = PaperPear_CSVParser::GetNiceHeaderName($sField); 
 
 //track correlation of raw -> nice name so we don't have to do on-the-fly nice name checks 
 $this->m_saHeader[$nIndex] = array(C_PPCSV_HEADER_RAW => $sField, C_PPCSV_HEADER_NICE => $sNiceName); 
 $this->m_naHeaderMap[$sNiceName] = $nIndex; 
 $nIndex++; 
 } 
 } 
 
 //read the next CSV entry 
 public function getNext() 
 { 
 //this is a basic read, you will likely want to change this to accommodate what 
 //you are using for CSV parameters (tabs, encapsulation, etc). 
 if (($saValues = fgetcsv($this->m_fp)) !== false) 
 { 
 $this->m_saValues = $saValues; 
 return true; 
 } 
 return false; 
 } 
 } 
 
 
 //quick example of usage 
 $o = new PaperPear_CSVParser('F:\foo.csv'); 
 while ($o->getNext()) 
 { 
 echo "Price=" . $o->getPrice() . "\r\n"; 
 } 
 
?>

kent at marketruler dot com ¶

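14 years ago

Note that fgetcsv, at least in PHP 5.3 or earlier, will NOT work with UTF-16 encoded files. Your options are to convert the entire file to ISO-8859-1 (or latin1), or to convert it line by line into ISO-8859-1 and then use str_getcsv (or a backwards-compatible implementation of it). If you need to read non-Latin alphabets, it is probably best to convert to UTF-8.

See str_getcsv for a backwards-compatible version for PHP < 5.3, and see utf8_decode for a function written by Rasmus Andersson which provides utf16_decode. The modification I added handles the fact that the BOM appears at the top of the file, but not on subsequent lines. So you need to store the endianness and then pass it in again when decoding each subsequent line. This modified version returns the endianness if it is not already known.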

<?php
/**
 * Decode UTF-16 encoded strings.
 *
 * Can handle both BOM'ed data and un-BOM'ed data.
 * Assumes Big-Endian byte order if no BOM is available.
 * From: https://php.dev.org.tw/manual/en/function.utf8-decode.php
 *
 * @param string $str UTF-16 encoded data to decode.
 * @return string UTF-8 / ISO encoded data.
 * @access public
 * @version 0.1 / 2005-01-19
 * @author Rasmus Andersson {@link http://rasmusandersson.se/}
 * @package Groupies
 */
function utf16_decode($str, &$be=null) {
 if (strlen($str) < 2) {
 return $str;
 }
 $c0 = ord($str[0]);
 $c1 = ord($str[1]);
 $start = 0;
 if ($c0 == 0xFE && $c1 == 0xFF) {
 $be = true;
 $start = 2;
 } else if ($c0 == 0xFF && $c1 == 0xFE) {
 $start = 2;
 $be = false;
 }
 if ($be === null) {
 $be = true;
 }
 $len = strlen($str);
 $newstr = '';
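 for ($i = $start; $i < $len; $i += 2) {
 // Combine the two bytes of each code unit; the high byte is shifted by 8 bits.
 if ($be) {
 $val = ord($str[$i]) << 8;
 $val += ord($str[$i+1]);
 } else {
 $val = ord($str[$i+1]) << 8;
 $val += ord($str[$i]);
 }
 // U+2028 (line separator) becomes "\n"; chr() keeps only the low byte of other values.
 $newstr .= ($val == 0x2028) ? "\n" : chr($val);
 }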
 return $newstr;
}
?>

Trying the "setlocale" trick did not work for me, e.g.:

<?php
setlocale(LC_CTYPE, "en.UTF16");
$line = fgetcsv($file, ...)
?>

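but that is perhaps because my platform didn't support it. However, fgetcsv only supports single characters for the delimiter, etc., and complains if you pass in a UTF-16 version of said character, so I gave up on that rather quickly.

Hope this is helpful to someone out there.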

sander at NOSPAM dot rotorsolutions dot nl ¶

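11 years ago

If you don't want to define an enclosure character, you can do the following:

<?php
 // "\0" is a single NUL byte; the integer 0x00 would be converted to the string "0".
 $row = fgetcsv($handle, 0, $delimiter, "\0");
?>

I needed this to detect the enclosure character used by a CSV file.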

junk at vhd dot com dot au ¶

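19 years ago

The fgetcsv function seems to follow the MS Excel conventions, which means:

- The quoting character is escaped by itself and not by the backslash.
(For example, let's use the double quote (") as the quoting character:

Two double quotes "" will give a single " once parsed, if they are inside a quoted field (otherwise neither of them will be removed).

\" will give \" whether it is in a quoted field or not (same for \\), and

if a single double quote is inside a quoted field, it will be removed. If it is not inside a quoted field, it will stay.)

- Leading and trailing spaces (\s or \t) are never removed, regardless of whether they are in quoted fields or not.

- Line breaks within fields are dealt with correctly if they are in quoted fields. (So previous comments stating the opposite are wrong, unless they are using a different PHP version... I am using 4.4.0.)

So fgetcsv is actually very complete and can deal with every possible situation. (It does need help with Macintosh line breaks though, as mentioned in the documentation.)

I wish I had known all this from the start. From my own benchmarks, fgetcsv strikes a very good compromise between memory consumption and speed.

-------------------------
Note: If backslashes are used to escape quotes, they can easily be removed afterwards. The same goes for leading and trailing spaces.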

nick at atomicdesign dot net ¶

12 years ago

I was getting a memory exhausted error when iterating through a CSV file. Using ini_set('auto_detect_line_endings', 1); fixed it.

matthias dot isler at gmail dot com ¶

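14 years ago

If you want to load some translations for your application, don't use CSV files for that, even if they are easier to handle.

The following code snippet: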


<?php 
$lang = array(); 
 
$handle = fopen('en.csv', 'r'); 
 
while($row = fgetcsv($handle, 500, ';')) 
{ 
 $lang[$row[0]] = $row[1]; 
} 
 
fclose($handle); 
?> 

is about 400% slower than this code:


<?php 
$lang = array(); 
 
$values = parse_ini_file('de.ini'); 
 
foreach($values as $key => $val) 
{ 
 $lang[$key] = $val; 
} 
?> 

That's the reason why you should always use .ini files for translations...


https://php.dev.org.tw/parse_ini_file

from_php at puggan dot se ¶

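8 years ago

Setting the $escape parameter does not return unescaped strings; it only prevents splitting on a $delimiter that has an escape character in front of it.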

<?php
 $tmp_file = "/tmp/test.csv";
 file_put_contents($tmp_file, "\"first\\\";\\\"secound\"");
 echo "raw:" . PHP_EOL . file_get_contents($tmp_file) . PHP_EOL . PHP_EOL;

 echo "fgetcsv with backslash escape:" . PHP_EOL;
 $f = fopen($tmp_file, 'r');
 while($r = fgetcsv($f, 1024, ';', '"', "\\"))
 {
 print_r($r);
 }
 fclose($f);
 echo PHP_EOL;

 echo "fgetcsv with # escape:" . PHP_EOL;
 $f = fopen($tmp_file, 'r');
 while($r = fgetcsv($f, 1024, ';', '"', "#"))
 {
 print_r($r);
 }
 fclose($f);
 echo PHP_EOL;
?>

daniel at softel dot jp ¶

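18 years ago

Note that fgetcsv() uses the system locale setting to make assumptions about character encoding.
So if you are trying to process a UTF-8 CSV file on an EUC-JP server (for example),
you will need to do something like this before you call fgetcsv():

setlocale(LC_ALL, 'ja_JP.UTF8');

[Also note that setlocale() doesn't *permanently* affect the system locale setting]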

code at ashleyhunt dot co dot uk ¶

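14 years ago

I needed a function to analyse a file for delimiters and line endings prior to importing the file into MySQL using LOAD DATA LOCAL INFILE.

I wrote this function to do the job. The results are (mostly) very accurate, and it works nicely with large files too.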
<?php
function analyse_file($file, $capture_limit_in_kb = 10) {
 // capture starting memory usage
 $output['peak_mem']['start'] = memory_get_peak_usage(true);

 // log the limit how much of the file was sampled (in Kb)
 $output['read_kb'] = $capture_limit_in_kb;
 
 // read in file
 $fh = fopen($file, 'r');
 $contents = fread($fh, ($capture_limit_in_kb * 1024)); // in KB
 fclose($fh);
 
 // specify allowed field delimiters
 $delimiters = array(
 'comma' => ',',
 'semicolon' => ';',
 'tab' => "\t",
 'pipe' => '|',
 'colon' => ':'
 );
 
 // specify allowed line endings
 $line_endings = array(
 'rn' => "\r\n",
 'n' => "\n",
 'r' => "\r",
 'nr' => "\n\r"
 );
 
 // loop and count each line ending instance
 foreach ($line_endings as $key => $value) {
 $line_result[$key] = substr_count($contents, $value);
 }
 
 // sort by largest array value
 asort($line_result);
 
 // log to output array
 $output['line_ending']['results'] = $line_result;
 $output['line_ending']['count'] = end($line_result);
 $output['line_ending']['key'] = key($line_result);
 $output['line_ending']['value'] = $line_endings[$output['line_ending']['key']];
 $lines = explode($output['line_ending']['value'], $contents);
 
 // remove last line of array, as this maybe incomplete?
 array_pop($lines);
 
 // create a string from the legal lines
 $complete_lines = implode(' ', $lines);
 
 // log statistics to output array
 $output['lines']['count'] = count($lines);
 $output['lines']['length'] = strlen($complete_lines);
 
 // loop and count each delimiter instance
 foreach ($delimiters as $delimiter_key => $delimiter) {
 $delimiter_result[$delimiter_key] = substr_count($complete_lines, $delimiter);
 }
 
 // sort by largest array value
 asort($delimiter_result);
 
 // log statistics to output array with largest counts as the value
 $output['delimiter']['results'] = $delimiter_result;
 $output['delimiter']['count'] = end($delimiter_result);
 $output['delimiter']['key'] = key($delimiter_result);
 $output['delimiter']['value'] = $delimiters[$output['delimiter']['key']];
 
 // capture ending memory usage
 $output['peak_mem']['end'] = memory_get_peak_usage(true);
 return $output;
}
?>

Example usage:
<?php
$Array = analyse_file('/www/files/file.csv', 10);

// example usable parts
// $Array['delimiter']['value'] => ,
// $Array['line_ending']['value'] => \r\n
?>

Full function output:
Array
(
    [peak_mem] => Array
        (
            [start] => 786432
            [end] => 786432
        )

    [line_ending] => Array
        (
            [results] => Array
                (
                    [nr] => 0
                    [r] => 4
                    [n] => 4
                    [rn] => 4
                )

            [count] => 4
            [key] => rn
            [value] =>

        )

    [lines] => Array
        (
            [count] => 4
            [length] => 94
        )

    [delimiter] => Array
        (
            [results] => Array
                (
                    [colon] => 0
                    [semicolon] => 0
                    [pipe] => 0
                    [tab] => 1
                    [comma] => 17
                )

            [count] => 17
            [key] => comma
            [value] => ,
        )

    [read_kb] => 10
)

Enjoy!

Ashley

jonathangrice at yahoo dot com ¶

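14 years ago

This is how to read a CSV file into a multidimensional array.

<?php
# Open the file.
if (($handle = fopen("file.csv", "r")) !== FALSE) {
    # Set the parent multidimensional array key to 0.
    $nn = 0;
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        # Count the total keys in the row.
        $c = count($data);
        # Populate the multidimensional array.
        for ($x = 0; $x < $c; $x++)
        {
            $csvarray[$nn][$x] = $data[$x];
        }
        $nn++;
    }
    # Close the file.
    fclose($handle);
}
# Print the contents of the multidimensional array.
print_r($csvarray);
?>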

matasbi at gmail dot com ¶

13 years ago

Parsing the Microsoft Excel "Unicode Text (*.txt)" format:

<?php
function parse($file) {
 if (($handle = fopen($file, "r")) === FALSE) return;
 while (($cols = fgetcsv($handle, 1000, "\t")) !== FALSE) {
 foreach( $cols as $key => $val ) {
 $cols[$key] = trim( $cols[$key] );
 $cols[$key] = iconv('UCS-2', 'UTF-8', $cols[$key]."\0") ;
 $cols[$key] = str_replace('""', '"', $cols[$key]);
 $cols[$key] = preg_replace("/^\"(.*)\"$/sim", "$1", $cols[$key]);
 }
 echo print_r($cols, 1);
 }
}
?>

ifedinachukwu at yahoo dot com ¶

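13 years ago

I had a CSV file whose fields included data with line endings (CRLF created by hitting Enter in an HTML textarea). Of course, the LF in these fields was escaped by MySQL during the creation of the CSV. The problem is that I could NOT get fgetcsv to work correctly here, since each and every LF was regarded as the end of a line of the CSV file, even when it was escaped!

Since what I wanted was to get THE FIRST LINE of the CSV file and then count the number of fields by splitting on all unescaped commas, I had to resort to this:

<?php
/*
First five lines of the CSV file: the 4th line has a line break within a data field. LF stands for line feed or \n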
1,okonkwo joseph,nil,2010-01-12 17:41:40LF
2,okafor john,cq and sulphonamides,2010-01-12 17:58:03LF
3,okoye andrew,lives with hubby in abuja,2011-03-30 13:39:19LF
4,okeke peter,In 2001\, had appendicectomy in AbaCR
\LF
In 2004\, had ELCS at a private hoapital in Lagos,2011-03-30 13:39:19LF
5,adewale chris,cq and sulphonamides,2010-01-12 17:58:03LF

*/

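$fp = fopen('file.csv', 'r');
$i = 1;
$str = '';
$srch = '';
while (false !== ($char = fgetc($fp))) {
    $str .= $char; // collects the string for output
    $srch .= $char; // used to search for an LF, possibly preceded by a backslash
    if (strlen($srch) > 2) {
        $srch = substr($srch, 1); // i.e. drop the first character
    }
    if ($i > 1 && $srch[1] == chr(10) && $srch[0] != '\\') { // chr(10) is LF, i.e. \n
        break; // an \n not preceded by a backslash is the true line end, so stop collecting the string
    }

    $i++;
}
echo $str; // should contain the first line as a string

?>
Perhaps there is a more elegant solution out there; if so, I'd be glad to know!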

jaimthorn at yahoo dot com ¶

15 years ago

I used fgetcsv to read pipe-delimited data files, and ran into the following quirk.

The data file contained data similar to this:

RECNUM|TEXT|COMMENT
1|hi!|some comment
2|"error!|another comment
3|where does this go?|yet another comment
4|the end!"|last comment

I read the file like this:

<?php
$row = fgetcsv( $fi, $length, '|' );
?>

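This went wrong at record 2: the quote right after the pipe caused the file to be read up to the following quote, which in this case was in record 4. Everything in between was stored in a single element of $row.

In this particular case it is easy to spot, but my script was processing thousands of records and it took me some time to figure out what went wrong.

The annoying thing is that there doesn't seem to be an elegant fix. You can't tell PHP not to use an enclosure, for example like this: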

<?php
$row = fgetcsv( $fi, $length, '|', '' );
?>

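(Well, you can tell PHP that, but it doesn't work.)

So you'd have to resort to a solution where you use an extremely unlikely enclosure, but since the enclosure can only be one character long, it may be hard to find.

Alternatively (and IMHO more elegantly), you can choose to read these files like this instead: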

<?php
$line = fgets( $fi, $length );
$row = explode( '|', $line );
?>

As it's more intuitive and resilient, I've decided to favor this "construct" over fgetcsv from now on.
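
For comparison, the "extremely unlikely enclosure" idea mentioned in this note can be sketched with a NUL byte as the enclosure. This is only an illustration under the assumption that the data never contains a NUL byte and that PHP 7.4 or later is used (for the empty escape); the sample data mirrors the records above.

```php
<?php
// Pipe-delimited sample data with a stray quote in record 2.
$data = "1|hi!|some comment\n"
      . "2|\"error!|another comment\n"
      . "3|where does this go?|yet another comment\n";

$stream = fopen('php://memory', 'w+');
fwrite($stream, $data);
rewind($stream);

$rows = [];
// "\0" (a single NUL byte) serves as an enclosure that is assumed never
// to occur in the data, so the stray quote is read as ordinary text.
while (($row = fgetcsv($stream, 0, '|', "\0", '')) !== false) {
    $rows[] = $row;
}
fclose($stream);

print_r($rows[1]); // the three fields of record 2, quote included
```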

vladimir at luchaninov dot com ¶

9 years ago

Here is an example of how to use this function with generators:
https://github.com/luchaninov/csv-file-loader (composer require "luchaninov/csv-file-loader:1.*")

$loader = new CsvFileLoader();
$loader->setFilename('/path/to/your_data.csv');

foreach ($loader->getItems() as $item) {
    var_dump($item); // do something here
}

If you have a CSV file like this:

id,name,surname
1,Jack,Black
2,John,Doe

you will get 2 items:

['id' => '1', 'name' => 'Jack', 'surname' => 'Black']
['id' => '2', 'name' => 'John', 'surname' => 'Doe']
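
Without the library, the same idea can be sketched with a plain generator. This is not the code of the package mentioned above; the function name csvRows is hypothetical, and the sketch assumes PHP 7.4 or later and a header row in the first line.

```php
<?php
// Hypothetical generator: yields each data row of a CSV file as an
// associative array keyed by the header row, one row at a time.
function csvRows(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        $header = fgetcsv($handle, 0, ',', '"', '');
        if ($header === false) {
            return; // empty file
        }
        while (($row = fgetcsv($handle, 0, ',', '"', '')) !== false) {
            // Skip blank lines and rows whose field count differs from the header.
            if ($row === [null] || count($row) !== count($header)) {
                continue;
            }
            yield array_combine($header, $row);
        }
    } finally {
        fclose($handle);
    }
}

// Usage with a temporary file holding the sample data shown above.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "id,name,surname\n1,Jack,Black\n2,John,Doe\n");
$items = iterator_to_array(csvRows($path), false);
unlink($path);

print_r($items);
```

Because rows are yielded one at a time, only the current row needs to be held in memory when the generator is consumed with foreach.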

Daniel Klein ¶

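8 years ago

The $escape parameter is completely non-intuitive, but it is not broken. Here is a breakdown of fgetcsv()'s behaviour. In the examples I've used underscores (_) to show spaces and brackets ([]) to show individual fields:

- Leading whitespace in each field will be stripped if it comes immediately before an enclosure: ___"foo" -> [foo]
- There can only be one enclosure per field, although it will be concatenated with any data that appears between the closing enclosure and the next delimiter/new line, including any trailing whitespace: ___"foo"_"bar"__ -> [foo_"bar"__]
- If the field does not begin with (leading whitespace +) an enclosure, the entire field is interpreted as raw data, even if enclosure characters appear elsewhere within the field: _foo"bar"_ -> [_foo"bar"_]
- Delimiters cannot be escaped outside enclosures; they must be enclosed instead. Delimiters do not need to be escaped within enclosures: "foo,bar","baz,qux" -> [foo,bar][baz,qux]; foo\,bar -> [foo\][bar]; "foo\,bar" -> [foo\,bar]
- Double enclosures inside a single enclosure are converted to single enclosures: "foobar" -> [foobar]; "foo""bar" -> [foo"bar]; """foo""" -> ["foo"]; ""foo"" -> [foo""] (an empty enclosure followed by raw data)
- The $escape parameter works as expected, but unlike enclosures it is NOT unescaped. The data has to be unescaped elsewhere in the code: "\"foo\"" -> [\"foo\"]; "foo\"bar" -> [foo\"bar]

Note: The following data (which is a very common problem) is invalid: "\". Its structure is equivalent to "@, in other words an opening enclosure, some data, and no closing enclosure.

The following functions can be used to get the expected behaviour:

<?php
// Removes escape characters before both enclosures and escapes, but leaves everything else untouched, similar to single quoting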
function fgetcsv_unescape_enclosures_and_escapes($fh, $length = 0, $delimiter = ',', $enclosure = '"', $escape = '\\') {
 $fields = fgetcsv($fh, $length, $delimiter, $enclosure, $escape);
 if ($fields) {
 $regex_enclosure = preg_quote($enclosure);
 $regex_escape = preg_quote($escape);
 $fields = preg_replace("/{$regex_escape}({$regex_enclosure}|{$regex_escape})/", '$1', $fields);
 }
 return $fields;
}

// Does NOT remove a lone escape character at the end of a field
function fgetcsv_unescape_all($fh, $length = 0, $delimiter = ',', $enclosure = '"', $escape = '\\') {
 $fields = fgetcsv($fh, $length, $delimiter, $enclosure, $escape);
 if ($fields) {
 $regex_escape = preg_quote($escape);
 $fields = preg_replace("/{$regex_escape}(.)/s", '$1', $fields);
 }
 return $fields;
}

// Removes lone escape characters at the end of fields
function fgetcsv_unescape_all_strip_last($fh, $length = 0, $delimiter = ',', $enclosure = '"', $escape = '\\') {
 $fields = fgetcsv($fh, $length, $delimiter, $enclosure, $escape);
 if ($fields) {
 $regex_escape = preg_quote($escape);
 $fields = preg_replace("/{$regex_escape}(.?)/s", '$1', $fields);
 }
 return $fields;
}
?>

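Caveat: Ideally there should be no unescaped escape characters outside enclosures; fields should be enclosed and escaped instead. If there are any unescaped escape characters, they may end up being removed too, depending on the function used.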

mortanon at gmail dot com ¶

19 years ago

Here is an example of a CSV iterator.

<?php
class CsvIterator implements Iterator
{
 const ROW_SIZE = 4096;
 /**
 * The pointer to the csv file.
 * @var resource
 * @access private
 */
 private $filePointer = null;
 /**
 * The current element, which will 
 * be returned on each iteration.
 * @var array
 * @access private
 */
 private $currentElement = null;
 /**
 * The row counter. 
 * @var int
 * @access private
 */
 private $rowCounter = null;
 /**
 * The delimiter for the csv file. 
 * @var str
 * @access private
 */
 private $delimiter = null;

 /**
 * This is the constructor. It tries to open the csv file. The method throws an exception
 * on failure.
 *
 * @access public
 * @param str $file The csv file.
 * @param str $delimiter The delimiter.
 *
 * @throws Exception
 */
 public function __construct($file, $delimiter=',')
 {
 try {
 $this->filePointer = fopen($file, 'r');
 $this->delimiter = $delimiter;
 }
 catch (Exception $e) {
 throw new Exception('The file "'.$file.'" cannot be read.');
 } 
 }

 /**
 * This method resets the file pointer.
 *
 * @access public
 */
 public function rewind() {
 $this->rowCounter = 0;
 rewind($this->filePointer);
 }

 /**
 * This method returns the current csv row as a 2 dimensional array
 *
 * @access public
 * @return array The current csv row as a 2 dimensional array
 */
 public function current() {
 $this->currentElement = fgetcsv($this->filePointer, self::ROW_SIZE, $this->delimiter);
 $this->rowCounter++; 
 return $this->currentElement;
 }

 /**
 * This method returns the current row number.
 *
 * @access public
 * @return int The current row number
 */
 public function key() {
 return $this->rowCounter;
 }

 /**
 * This method checks if the end of file is reached.
 *
 * @access public
 * @return boolean Returns true on EOF reached, false otherwise.
 */
 public function next() {
 return !feof($this->filePointer);
 }

 /**
 * This method checks if the next row is a valid row.
 *
 * @access public
 * @return boolean If the next row is a valid row.
 */
 public function valid() {
 if (!$this->next()) {
 fclose($this->filePointer);
 return false;
 }
 return true;
 }
}
?>

Usage:

<?php
$csvIterator = new CsvIterator('/path/to/csvfile.csv');
foreach ($csvIterator as $row => $data) {
    // do something with $data
}
?>

Xander ¶

14 years ago

I had a problem with multibyte characters. The file was in windows-1250, the script was in UTF-8, and setlocale wasn't working, so I made a simple and safe workaround:


<?php 
$fc = iconv('windows-1250', 'utf-8', file_get_contents($_FILES['csv']['tmp_name'])); 
 
 file_put_contents('tmp/import.tmp', $fc); 
 $handle = fopen('tmp/import.tmp', "r"); 
 $rows = array(); 
 while (($data = fgetcsv($handle, 0, ";")) !== FALSE) { 
 
 $rows[] = $data; 
 
 } 
 fclose($handle); 
 unlink('tmp/import.tmp'); 
?> 

Hope you will find it useful.
Sorry for my English.
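
A variation on this workaround that avoids the temporary copy is to convert the encoding while reading, by using PHP's built-in convert.iconv.* stream filter. The sketch below assumes the iconv extension is available and that it knows the encoding name CP1250; the helper name readCsvAsUtf8 and the sample data are made up for illustration.

```php
<?php
// Hypothetical helper: read a CSV file stored in $sourceEncoding and
// return its rows as UTF-8 strings, converting while reading.
function readCsvAsUtf8(string $path, string $sourceEncoding, string $separator = ';'): array
{
    $handle = fopen($path, 'r');
    // convert.iconv.<from>/<to> is a built-in stream filter (iconv extension).
    stream_filter_append($handle, "convert.iconv.$sourceEncoding/UTF-8", STREAM_FILTER_READ);

    $rows = [];
    while (($row = fgetcsv($handle, 0, $separator, '"', '')) !== false) {
        $rows[] = $row;
    }
    fclose($handle);
    return $rows;
}

// Sample file: the bytes A3 F3 64 9F spell "Łódź" in CP1250.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "id;city\n1;\xA3\xF3d\x9F\n");
$rows = readCsvAsUtf8($path, 'CP1250');
unlink($path);

echo $rows[1][1], "\n";
```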

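Anonymous ¶

18 years ago

Beware of characters with binary value 0, as they seem to make fgetcsv ignore the remaining part of the line in which they appear.

Maybe this is normal under some convention I don't know, but a file exported from Excel *sometimes* had those as the values of some cells, so fgetcsv returned a variable number of cells for different lines.

I'm using PHP 4.3.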

kurtnorgaz at web dot de ¶

21 years ago

You should pay attention to the fact that "fgetcsv" removes leading TAB characters "chr(9)" while reading the file.

This means that if chr(9) is the first character in the file and you use fgetcsv, this character is automatically deleted.

Example:
file content:
chr(9)first#second#third#fourth

source:
<?php $line = fgetcsv($handle,500,"#"); ?>

The array $line looks like:
$line[0] = first
$line[1] = second
$line[2] = third
$line[3] = fourth

and not
$line[0] = chr(9)first
$line[1] = second
$line[2] = third
$line[3] = fourth

A chr(9) that follows another character is not deleted!

Example:
file content:
Achr(9)first#second#third#fourth

source:
<?php $line = fgetcsv($handle,500,"#"); ?>

The array $line looks like:
$line[0] = Achr(9)first
$line[1] = second
$line[2] = third
$line[3] = fourth

lzsiga at freemail dot c3 dot hu ¶

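2 months ago

There is a special syntax that prevents Excel from automatically converting field content to a date or a floating-point number: ="fieldcontent" (with a leading equals sign). (Note that this syntax should not be used if the content contains a line break or the field separator.)

Currently fgetcsv does not support this syntax, but it can be handled with some post-processing.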
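
One possible shape for that post-processing is sketched below. The helper name unwrapExcelFormula is hypothetical, and treating a doubled quote inside the wrapper as a single quote is an assumption rather than documented behaviour. The sketch assumes PHP 7.4 or later.

```php
<?php
// Hypothetical helper: turn ="content" into content; any other value
// is returned unchanged.
function unwrapExcelFormula(string $field): string
{
    if (preg_match('/^="(.*)"$/s', $field, $m) === 1) {
        // Assumption: inside the wrapper a doubled quote stands for one quote.
        return str_replace('""', '"', $m[1]);
    }
    return $field;
}

$stream = fopen('php://memory', 'w+');
fwrite($stream, '1,="00123",plain' . "\n");
rewind($stream);

// fgetcsv() returns the second field as the raw text ="00123", because
// the field does not start with the enclosure character.
$row = array_map('unwrapExcelFormula', fgetcsv($stream, 0, ',', '"', ''));
fclose($stream);

print_r($row);
```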

kamil dot dratwa at gmail dot com ¶

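3 years ago

This part of the description of the length parameter's behaviour is a bit tricky, because it does not mention that the separator is also counted as a character and converted to an empty string: "Otherwise the line is split in chunks of length characters (...)".

First, let's look at an example of reading a line that does not contain separators:

<?php
 file_put_contents('data.csv', 'foo'); // no separator
 $handle = fopen('data.csv', 'c+');
 $data = fgetcsv($handle, 2);
 var_dump($data);
?>

The example above will output:
array(1) {
  [0]=>
  string(2) "fo"
}

Now let's add separators:

<?php
 file_put_contents('data.csv', 'f,o,o'); // commas as separators
 $handle = fopen('data.csv', 'c+');
 $data = fgetcsv($handle, 2);
 var_dump($data);
?>

The second example will output:

array(2) {
  [0]=>
  string(1) "f"
  [1]=>
  string(0) ""
}

Now let's change the length:

<?php
file_put_contents('data.csv', 'f,o,o');
$handle = fopen('data.csv', 'c+');
$data = fgetcsv($handle, 3); // notice the updated length
var_dump($data);
?>

The output of the last example is:

array(2) {
  [0]=>
  string(1) "f"
  [1]=>
  string(1) "o"
}

The final conclusion is that, when a line is split into chunks, the separator is counted as a character while reading but is then converted to an empty string. Moreover, if the separator is at the very first or last position of a chunk, it will be included in the result array, but if it is between other characters, it will be ignored.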

lewiscowles at me dot com ¶

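4 years ago

If anyone else is having trouble dealing with a byte-order mark (BOM), the following code should work. As usual there is no warranty, and you should test your code... This is only for UTF-8.

<?php

//...

$fh = fopen('wut.csv', 'r');
$firstThreeBytes = fread($fh, 3);
if ($firstThreeBytes !== "\xef\xbb\xbf") {
    rewind($fh);
}
while (($row = fgetcsv($fh, 10000, ',')) !== false) {
    // your code goes here
}

This basically reads in 3 bytes and checks whether they match the BOM.

https://en.wikipedia.org/wiki/Byte_order_mark has more information if you are dealing with other encodings.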
