I have a file containing 800 lines like:
id binary-coded-info
---------------------------
4657 001001101
4789 110111111
etc.
where each 0 or 1 stands for the presence of some feature. I want to read this file and do several bitwise logical operations on the binary-coded-info (the operations depend on user input and on info from a second file with 3000 lines). Then, the recalculated binary-coded-info should be written to a file (with trailing zeros), e.g.
4657 000110011
4789 110110000
etc.
How should I do this without writing my own base conversion routine? I am open to anything, including languages I do not know, like Python, Perl, etc. And it should work without compiling.
So far, I have tried to script, awk and sed my way through it. This would mean (I think): batch read as base 2, convert to base 10, do the bitwise operations depending on user input and the second file, convert back to base 2, add leading zeros and print. The usual console hints to use bc do not seem elegant because I have many lines in a file. The same holds for dc.sed. And awk doesn't seem to have an equivalent to flagging input as binary (as in "echo $((2#101010))"), and the printf trick doesn't work for binary either. So, how would I do this most elegantly (or at all, for that matter)?
-
In C, you can use "strtol(str, NULL, 2)" to do the conversion, if you're already doing this in C.
Something like the following would work:
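    FILE* f = fopen("myfile.txt", "r");
    char line[1024];
    /* fgets returns NULL at end of file; line is an array and cannot be assigned to */
    while (fgets(line, sizeof(line), f) != NULL) {
        char* p;
        long column1 = strtol(line, &p, 10);  /* id, base 10 */
        long column2 = strtol(p, &p, 2);      /* bit string, base 2 */
        ...
    }

You'll need to add error-handling, etc.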
-
Why convert them and use bit operations?
In Python, you can do all of this as a string.
    for line in myFile:
        key, value = line.split()
        bits = list(value)  # bits will be a list of 1-char strings ['1','0','1',...]
        # ... do stuff to bits ...
        print("%s %s" % (key, "".join(bits)))  # join the modified bits, not the original value

FogleBird: You can also convert the items in bits to bools with:

    bits = [n != '0' for n in bits]

This way you can just say if bits[2]: ...
-
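To make the string-only approach above concrete, here is a minimal sketch of AND, OR and XOR applied character by character to two bit strings of equal length. The helper names (combine, bit_and, bit_or, bit_xor) are made up for this illustration and do not come from the answer.

```python
def combine(a, b, op):
    """Apply a two-argument bit operation position by position to two bit strings."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return "".join(op(x, y) for x, y in zip(a, b))


# Hypothetical per-character operations on '0'/'1' characters.
def bit_and(x, y):
    return "1" if x == "1" and y == "1" else "0"


def bit_or(x, y):
    return "1" if x == "1" or y == "1" else "0"


def bit_xor(x, y):
    return "1" if x != y else "0"


print(combine("001001101", "000001101", bit_and))  # 000001101
print(combine("001001101", "000001101", bit_xor))  # 001000000
```

Because nothing is ever converted to a number, the leading zeros survive without any extra padding step.
-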
In Python, you can convert from a binary string using int, specifying base 2, i.e.:

    >>> int('110111111', 2)
    447

To convert back, there is a bin function in Python 2.6 or 3, but not in Python 2.5, so you'd need to implement it yourself (or use something like the below):

    def bin(x, width):
        return ''.join(str((x >> i) & 1) for i in range(width))[::-1]

    >>> bin(447, 9)
    '110111111'

(The width is the number of digits to pad to; your examples seem to be using 9-bit numbers.)
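On Python 2.6 or later, the built-in format function with the "b" format type can do the zero-padded conversion back, so a hand-written helper may not be needed. The sketch below assumes 9-bit values as in the question; to_bits and invert are hypothetical names chosen for this example. It also shows why bitwise NOT needs a mask: Python integers are not fixed-width, so ~n is negative unless it is cut back to the intended number of bits.

```python
def to_bits(n, width):
    """Format a non-negative integer as a zero-padded binary string."""
    return format(n, "0{}b".format(width))


def invert(n, width):
    """Bitwise NOT restricted to the lowest `width` bits."""
    mask = (1 << width) - 1
    return ~n & mask


value = int("001001101", 2)
print(to_bits(value, 9))             # 001001101
print(to_bits(invert(value, 9), 9))  # 110110010
```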
-
"Simple" Perl one-liner (replace foo bar baz quux with your flags):

    perl -le '@f=qw/foo bar baz quux/;$_&&print($f[$i]),$i++for split//, shift' 1011

Here is a readable Perl version:

    #!/usr/bin/perl
    use strict;
    use warnings;

    #flags that can be turned on and off, the first
    #flag is turned on/off by the left-most bit
    my @flags = (
        "flag one",
        "flag two",
        "flag three",
        "flag four",
        "flag five",
        "flag six",
        "flag seven",
        "flag eight",
    );

    #turn the command line argument into individual
    #ones and zeros
    my @bits = split //, shift;

    #loop through the bits printing the flag that
    #goes with the bit if it is 1
    my $i = 0;
    for my $bit (@bits) {
        if ($bit) {
            print "$flags[$i]\n";
        }
        $i++;
    }
-
Expanding on Brian's answer:
    # Python 3; on Python 2 use cStringIO.StringIO instead of io.StringIO
    import csv
    import io
    import sys

    # Get rid of the '----' line for simplicity
    data_file = '''id binary-coded-info
    4657 001001101
    4789 110111111
    '''

    data = []  # A list for the row dictionaries (we could just as easily have used the regular reader method)

    # Read in the file using the csv module
    # Each row will be a dictionary with keys corresponding to the first row
    reader = csv.DictReader(io.StringIO(data_file), delimiter=' ', skipinitialspace=True)
    try:
        for row in reader:
            data.append(row)  # Add the row dictionary to the data list
    except csv.Error as e:
        sys.exit('line %d: %s' % (reader.line_num, e))

    # Do something with the bits
    first = int(data[0]['binary-coded-info'], 2)  # First bit string
    assert first & int('00001101', 2) == int('1101', 2)     # Bitwise AND
    assert first | int('00001101', 2) == int('1001101', 2)  # Bitwise OR
    assert first ^ int('00001101', 2) == int('1000000', 2)  # Bitwise XOR
    # Ones complement: ~first alone is negative in Python, so mask it to 9 bits
    assert (~first & int('111111111', 2)) == int('110110010', 2)
    assert first << 2 == int('100110100', 2)  # Binary Left Shift
    assert first >> 2 == int('000010011', 2)  # Binary Right Shift

See the Python docs on expressions and on the csv module for more information.
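Putting the pieces together for the task in the question, a rough sketch might look like the following. It is only an illustration under assumptions: the OPERATIONS table, the recode function and the idea of a single mask string are invented here, since the thread does not say what the user input or the second file actually contain. The mask is assumed to have the same width as the data.

```python
import operator

# Hypothetical mapping from a user-supplied name to a bitwise operation.
OPERATIONS = {"and": operator.and_, "or": operator.or_, "xor": operator.xor}


def recode(lines, op_name, mask_bits):
    """Yield 'id bits' lines with the chosen operation applied to each record."""
    op = OPERATIONS[op_name]
    mask = int(mask_bits, 2)
    for line in lines:
        fields = line.split()
        # Skip the header, the dashed separator, and blank lines.
        if len(fields) != 2 or set(fields[1]) - set("01"):
            continue
        key, bits = fields
        result = op(int(bits, 2), mask)
        # Pad back to the original width so leading zeros are kept.
        yield "{} {}".format(key, format(result, "0{}b".format(len(bits))))


sample = [
    "id binary-coded-info",
    "---------------------------",
    "4657 001001101",
    "4789 110111111",
]
for out in recode(sample, "and", "110110000"):
    print(out)
```

Since recode accepts any iterable of lines, an open file object can be passed in place of the sample list, and each yielded line can be written to an output file instead of printed.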
-
Hi, thanks for all of your answers! I started learning python and will hopefully look into perl alongside. This was a great starting point! - Matt