awk with dates before 1970

From https://www.gnu.org/software/gawk/manual/html_node/Time-Functions.html I understand that gawk has only two functions for working with dates and times: mktime and strftime.

So I can parse any date with mktime, which returns a number of seconds that I can do arithmetic on, and then format the result as desired with strftime.

This works like a charm for any date after "1970 01 01 00 00 00"

Using awk, how can I format dates before 1970?

$ awk 'BEGIN{t=mktime("1970 01 01 00 00 00"); print t; print strftime("%Y-%m-%d", t) }'
10800
1970-01-01
$ awk 'BEGIN{t=mktime("1960 01 01 00 00 00"); print t; print strftime("%Y-%m-%d", t) }'
-315608400
awk: cmd. line:1: (FILENAME=- FNR=1) fatal: strftime: second argument less than 0 or too big for time_t

Unfortunately, as you've seen, gawk just can't do this directly. The gawk manual says:

All known POSIX-compliant systems support timestamps from 0 through 2^31 - 1, which is sufficient to represent times through 2038-01-19 03:14:07 UTC. Many systems support a wider range of timestamps, including negative timestamps that represent times before the epoch.

The manual doesn't say what strftime() does if given an out-of-range date.

But even on my system, which does behave sensibly for negative time_t values, gawk's strftime() function doesn't support them (though mktime() does), and so can't handle dates before 1970. I consider this to be a bug in gawk.

(My advice would be to use Perl instead of Awk, but that doesn't answer the question you asked.)

In principle, you could reinvent the wheel by re-implementing a function like strftime() in awk. But that would be overkill.
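That said, if all you need is the calendar date in UTC (no time zones, no other format specifiers), the arithmetic is short. The following is only a sketch under those assumptions, not a replacement for strftime(): `utc_date` and `floor` are names made up for this example, and the day-count conversion is the well-known "civil from days" calculation rather than anything gawk provides. It uses plain awk arithmetic, so it does not depend on gawk's time functions at all.

```shell
# Hypothetical helper: utc_date SECONDS prints YYYY-MM-DD (UTC) for any
# epoch value, including negative ones. Uses only plain awk arithmetic.
utc_date() {
    awk -v secs="$1" '
    # int() truncates toward zero, so negative values need a real floor
    function floor(x,   i) { i = int(x); return (i > x) ? i - 1 : i }

    BEGIN {
        days = floor(secs / 86400)            # whole days since 1970-01-01
        z    = days + 719468                  # shift the origin to 0000-03-01
        era  = floor(z / 146097)              # 400-year cycle number
        doe  = z - era * 146097               # day within that cycle
        yoe  = floor((doe - floor(doe / 1460) + floor(doe / 36524) - floor(doe / 146096)) / 365)
        doy  = doe - (365 * yoe + floor(yoe / 4) - floor(yoe / 100))
        mp   = floor((5 * doy + 2) / 153)     # month index, March = 0
        d    = doy - floor((153 * mp + 2) / 5) + 1
        m    = (mp < 10) ? mp + 3 : mp - 9
        y    = yoe + era * 400 + (m <= 2)     # Jan and Feb belong to the next year
        printf "%04d-%02d-%02d\n", y, m, d
    }'
}

utc_date -315619200
```

The call at the end passes the UTC timestamp for midnight on Jan 1, 1960 and prints 1960-01-01. Note that mktime() works in local time, so a value taken from it would need the zone offset applied first.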

If your system has a working GNU coreutils date command, you can invoke it from gawk. Using your example of Jan 1, 1960:

$ cat 1960.awk
#!/usr/bin/awk -f

BEGIN {
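    # Note: month "00" and day "00" are normalized by mktime() to
    # Nov 30, 1959; use "1960 01 01 00 00 00" for Jan 1, 1960 itself.
    timestamp = mktime("1960 00 00 00 00 00")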
    print "mktime() returned " timestamp

    if (0) {
        # This doesn't work
        s = strftime("%Y-%m-%d %H:%M:%S", timestamp)
        print "strftime() returned ", s
    }
    else {
        # This works: let GNU date format the (negative) timestamp
        cmd = "date '+%Y-%m-%d %H:%M:%S' -d @" timestamp
        cmd | getline t
        close(cmd)
        print "The date command printed \"" t "\""
    }
}
$ ./1960.awk
mktime() returned -318355200
The date command printed "1959-11-30 00:00:00"
$

(I gave up on figuring out the sequence of quotes and backslashes needed to do this as a one-liner from a shell prompt.)

This probably makes sense if you have a large existing awk program and you need to add this one feature to it. But if you're not stuck with doing this in awk, you might consider using something else; awk may not be the right tool for what you're trying to accomplish.

Or, if you're really ambitious, you could modify the gawk sources to handle this case correctly.


So, it's a bug ...

I'm using GNU awk 4.0.2. A quick look at the source suggests it is easy to fix:

glaudiston:/sources/gawk-4.0.2$ diff builtin.c.orig builtin.c
1701,1702c1701,1702
<                       if (clock_val < 0)
<                               fatal(_("strftime: second argument less than 0 or too big for time_t"));
---
>                       // if (clock_val < 0)
>                       //      fatal(_("strftime: second argument less than 0 or too big for time_t"));
glaudiston:/sources/gawk-4.0.2$ echo "" | ./gawk '{ts="1969 12 31 23 00 00";format="%Y/%m/%d";tv=mktime(ts);print tv;print strftime(format, tv)}'
7200
1969/12/31
glaudiston:/sources/gawk-4.0.2$ echo "" | ./gawk '{ts="1960 01 01 00 00 00";format="%Y/%m/%d";tv=mktime(ts);print tv;print strftime(format, tv)}'
-315608400
1960/01/01

For my purpose it worked, but I'm not sure it was a good idea. I'll send this to the gawk mailing list for approval.

Discussion started at: https://lists.gnu.org/archive/html/bug-gawk/2015-04/msg00012.html

Solution Update:

The gawk developers have fixed the bug, so upgrading gawk to a newer version solves the problem:

https://lists.gnu.org/archive/html/bug-gawk/2015-04/msg00036.html
