pandas read_excel: reading Excel date-formatted cells

I am using pandas to analyze an Excel file that contains a date-formatted column named date, with stored values such as 2018-12-31, and I read all of the data with pd.read_excel. Because the values in the sheet were entered by hand, I set converters={"date": str} in read_excel to avoid potential parsing errors. However, the date in the resulting DataFrame comes out as "2018-12-31 00:00:00", not the value as it is stored in Excel. Is there any way to read the date as the original text?

Jul.05,2022

My friend, you can use the datetime library:

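import datetime

def time_convert(time_str):
    # Parse the "YYYY-MM-DD HH:MM:SS" text that read_excel produced ...
    time_obj = datetime.datetime.strptime(str(time_str), '%Y-%m-%d %H:%M:%S')
    # ... and format it back as date-only text.
    return time_obj.strftime('%Y/%m/%d')

# dataset is the DataFrame returned by pd.read_excel; 'date' is the column named in the question.
dataset['date'] = dataset['date'].apply(time_convert)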
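
A vectorized alternative to the row-by-row apply above, as a minimal sketch. It assumes the column is named date (as in the question) and uses a small hand-built DataFrame as a stand-in for the real Excel data.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by pd.read_excel.
dataset = pd.DataFrame({"date": ["2018-12-31 00:00:00", "2019-01-15 00:00:00"]})

# Parse the whole column at once, then format it back to date-only text.
parsed = pd.to_datetime(dataset["date"], format="%Y-%m-%d %H:%M:%S")
dataset["date"] = parsed.dt.strftime("%Y/%m/%d")

print(dataset["date"].tolist())  # ['2018/12/31', '2019/01/15']
```

The result is the same kind of text column that time_convert produces, without calling a Python function once per row.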

Edited at 12:27 on 2019.2.24

A quick look at the official API documentation shows three parameters you can try:

I. dtype parameter

dtype: Type name or dict of column-> type, default None

Data type for data or columns. E.g. {'a': np.float64, 'b': np.int32}
Use `object` to preserve data as stored in Excel and not interpret dtype.
If converters are specified, they will be applied INSTEAD of dtype conversion.

In other words: dtype is a type name or a dict mapping column -> type, and defaults to None.
It sets the data type for the data or for individual columns, e.g. {'a': np.float64, 'b': np.int32}.
Use object to keep the data as stored in Excel rather than interpreting its dtype.
If converters are specified, they are applied INSTEAD of the dtype conversion.

And the converters parameter:

converters: dict, default None

  Dict of functions for converting values in certain columns. Keys can
  either be integers or column labels, values are functions that take one
  input argument, the Excel cell content, and return the transformed
  content.

In other words: converters is a dict, default None.
It maps columns to the functions used to convert their values. The
keys can be integers or column labels, and each value is a function that takes one argument, the Excel cell content, and returns the converted content.
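
As an illustration of the converters description above, here is a minimal sketch of a converter that returns date cells as date-only text. It assumes the reader passes a date cell to the converter as a datetime-like object; the names keep_date_text and read_with_text_dates, the path argument, and the column name "date" are hypothetical.

```python
import datetime

import pandas as pd


def keep_date_text(value):
    """Return date-like cells as 'YYYY-MM-DD' text; other cells as plain text."""
    # pd.Timestamp and datetime.datetime are both subclasses of datetime.date.
    if isinstance(value, datetime.date):
        return value.strftime("%Y-%m-%d")
    return str(value)


def read_with_text_dates(path):
    # Hypothetical helper: `path` and the column name "date" are assumptions.
    return pd.read_excel(path, converters={"date": keep_date_text})


print(keep_date_text(datetime.datetime(2018, 12, 31)))  # 2018-12-31
print(keep_date_text("2018-12-31"))                     # 2018-12-31
```

This produces a fixed YYYY-MM-DD layout; it does not recover whatever display format the cell has in Excel.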

So you could try setting dtype={'date': str} in read_excel() instead of the converters parameter.
I tried it, and it did not solve the problem.
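
A small sketch of why plain str gives the same result either way. It assumes the Excel reader hands the date cell to pandas as a datetime object, which matches the output reported in the question.

```python
import datetime

# Assumption: this is what a date cell holding 2018-12-31 looks like
# by the time str is applied to it.
cell = datetime.datetime(2018, 12, 31)

# Converting a datetime to text always includes the time part ...
print(str(cell))         # 2018-12-31 00:00:00
# ... while converting only its date part does not.
print(str(cell.date()))  # 2018-12-31
```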

It looks like a parsing issue. By default, dateutil.parser.parser is used to parse the date, and a pd.Timestamp object is returned, which is displayed as xxxx-xx-xx xx:xx:xx by default. You could override the __str__() and __repr__() methods of pd.Timestamp.

II. parse_dates parameter

parse_dates: bool, list-like, or dict, default False

The behavior is as follows:

* bool. If True -> try parsing the index.
* list of int or names. e.g. If [1, 2, 3] -> try parsing columns 1, 2, 3 each as a separate date column.
* list of lists. e.g.  If [[1, 3]] -> combine columns 1 and 3 and parse as a single date column.
* dict, e.g. {'foo' : [1, 3]} -> parse columns 1, 3 as date and call result 'foo'

 If a column or index contains an unparseable date, the entire column or
 index will be returned unaltered as an object data type. For non-standard
 datetime parsing, use `pd.to_datetime` after `pd.read_csv`

Let's talk about the last paragraph:

If a column or index contains an unparseable date, the entire column or index is returned unchanged as an object data type. For non-standard datetime parsing, use pd.to_datetime() after pd.read_csv().
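
The pd.to_datetime step mentioned in that paragraph could look like the sketch below. The sample values and the day.month.year layout are made up for illustration.

```python
import pandas as pd

# Hypothetical column as it might come back unparsed (as text) from the reader.
raw = pd.Series(["31.12.2018", "15.01.2019", "not a date"])

# An explicit format handles the non-standard layout; errors="coerce" turns
# unparseable entries into NaT instead of raising an exception.
parsed = pd.to_datetime(raw, format="%d.%m.%Y", errors="coerce")

print(parsed)
```

Entries that could not be parsed show up as NaT, so they can be found afterwards with parsed.isna().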

After reading the last date-related parameter, it finally made sense to me.

III. date_parser parameter

date_parser: function, optional
Function to use for converting a sequence of string columns to an array of datetime instances.
The default uses dateutil.parser.parser to do the conversion.
Pandas will try to call date_parser in three different ways, advancing to the next if an exception occurs:

 1) Pass one or more arrays (as defined by `parse_dates`) as arguments; 
 2) concatenate (row-wise) the string values from the columns defined by `parse_dates` into a single array and pass that; 
 3) call `date_parser` once for each row using one or more strings (corresponding to the columns defined by `parse_dates`) as arguments.

In other words:

date_parser: function, optional
This parameter takes a function that converts a sequence of string columns into an array of datetime instances.
The default uses dateutil.parser.parser for the conversion.
pandas will try to call date_parser in three different ways, moving on to the next if an exception occurs:

1) pass one or more arrays (as defined by parse_dates) as arguments;
2) concatenate (row-wise) the string values from the columns defined by parse_dates into a single array and pass that;
3) call date_parser once for each row, using one or more strings (corresponding to the columns defined by parse_dates) as arguments.

That is, unless you supply a function that leaves the date string unchanged, pandas will always parse it into a datetime value (which is why you see "2018-12-31 00:00:00").

What you are really asking is:

Why is the date not stored as text?

pandas is a data analysis tool. A date has to be converted into a dedicated datetime type before the computer can compare, sort, and do arithmetic on it. Stored as text, it takes up more memory, and the computer cannot treat it as a date.
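
A rough sketch of that trade-off, using 1000 made-up dates stored once as datetimes and once as text. The exact byte counts depend on the pandas version and string backend, so none are shown.

```python
import pandas as pd

# Hypothetical data: the same 1000 consecutive dates stored two ways.
as_datetime = pd.Series(pd.date_range("2018-01-01", periods=1000, freq="D"))
as_text = as_datetime.dt.strftime("%Y-%m-%d")

# Bytes used by the values themselves (index excluded).
datetime_bytes = as_datetime.memory_usage(index=False, deep=True)
text_bytes = as_text.memory_usage(index=False, deep=True)
print(datetime_bytes, text_bytes)

# Date arithmetic works directly on the datetime column.
span_days = (as_datetime.max() - as_datetime.min()).days
print(span_days)  # 999
```

The text column would first have to be parsed back with pd.to_datetime before a difference in days could be computed from it.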

On the other hand, if all you want is to keep the date as text, Excel alone is enough; pandas would be overkill.
