HP OpenVMS Systems

ask the wizard

Carriage Returns and RMS File Conversions?


The Question is:

Dear Wizard,
I've got a sequential file from which I need to remove the carriage returns
at the end of each line. I've tried converting the file with an FDL to have
no carriage return, but it makes no difference. I know the carriage returns
can be removed on UNIX using chomp. How can I do this on VMS? The file is
200,000 blocks in size.

The Answer is:

  This depends on the current format of the file.  The OpenVMS Wizard
  assumes that the reason for the CR characters is that this is a STREAM
  file copied from a Microsoft MS-DOS or Microsoft Windows system, as this
  is a common reason for seeing apparently extraneous CR characters embedded
  within a file.
  RMS recognises three types of stream files:
      1) STREAM_LF - in which records are delimited by an LF character
      2) STREAM_CR - in which records are delimited by a CR character
      3) STREAM    - in which records are delimited by an LF character,
                     a CR+LF character pair, or an FF or VT character
  Often, text files from MS-DOS or Windows systems will have records ending
  with a CR+LF pair. When such a file is copied onto a VMS system as a
  STREAM_LF file, the CR character becomes part of the data stream and
  therefore will appear at the end of each record.
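  To make the effect concrete, here is a small Python sketch (not VMS
  code; the function names and the sample bytes are invented for
  illustration). It splits the same bytes two ways: on LF alone, roughly
  as STREAM_LF locates record ends, and on CR+LF or a lone LF, roughly as
  STREAM does. FF and VT are left out to keep it short.

```python
import re

# Sample bytes as a DOS/Windows program would write them (illustrative data).
data = b"first line\r\nsecond line\r\n"

def _drop_empty_tail(records):
    # A trailing delimiter leaves one empty element at the end; drop it.
    return records[:-1] if records and records[-1] == b"" else records

def split_stream_lf(buf):
    """Split on LF only, approximating STREAM_LF record boundaries."""
    return _drop_empty_tail(buf.split(b"\n"))

def split_stream(buf):
    """Split on CR+LF or a lone LF, approximating STREAM boundaries."""
    return _drop_empty_tail(re.split(rb"\r\n|\n", buf))

print(split_stream_lf(data))  # the CR stays at the end of each record
print(split_stream(data))     # the CR is consumed with the delimiter
```

  The bytes are identical in both calls. Only the rule for finding the
  end of a record changes.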
  You can check if your file falls into this category with the following
  two commands:
    	$ DIRECTORY/FULL filespec
  Check that the Record Format is Stream_LF:
    	Record format:      Stream_LF, maximum 0 bytes, longest 0 bytes
  and that the records contain a CR+LF pair. Ensure you dump sufficient
  blocks to see the ends of a number of records:
        $ DUMP/BLOCK=COUNT:1 filespec
        31310962 65462031 300A0D38 312E3009 .0.18..01 Feb.11 000020
  0A0D is a CR+LF pair (remember that the hex dump reads right to left!).
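  If reading the hex columns backwards by eye is error-prone, the
  right-to-left rule can be checked mechanically. The following Python
  sketch (the helper name dump_line_to_bytes is hypothetical) takes only
  the hex columns of the sample DUMP line above and returns the bytes in
  file order.

```python
def dump_line_to_bytes(hex_columns):
    """Convert the hex columns of one DUMP line into bytes in file order.

    DUMP prints the lowest address at the right-hand end of the line, so
    the file-order bytes are the hex pairs read from right to left.
    """
    raw = bytes.fromhex(hex_columns.replace(" ", ""))
    return raw[::-1]

line = "31310962 65462031 300A0D38 312E3009"
decoded = dump_line_to_bytes(line)
print(decoded)               # bytes in the order they occur in the file
print(b"\r\n" in decoded)    # whether a CR+LF pair is present
```

  In file order the pair appears as 0D 0A, which is CR followed by LF.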
  If your file satisfies BOTH these conditions, you have two choices for
  removing the CR from your data. The first doesn't actually remove the
  character; it just tells RMS that the CR is part of the record delimiter:
    	$ SET FILE/ATTRIBUTE=(RFM=STM) filespec
  Note that this does not involve any conversion or copying of data. The
  DIRECTORY/FULL command will now display the record format as:
    	Record format:      Stream, maximum 0 bytes, longest 0 bytes
  and applications reading the file will no longer "see" the embedded CR.
  If you really must physically remove the CR character, you can now do
  so with a simple CONVERT command:
        $ SET FILE/ATTRIBUTE=(RFM=STM) filespec
        $ CONVERT/FDL=SYS$INPUT filespec newfilespec
        RECORD
                FORMAT          STREAM_LF
        $
  The first command tells RMS that the record delimiter is CR+LF, as before.
  The second performs a conversion of the file to STREAM_LF format, so when
  the new file is created, records will be delimited by a single LF character.
  If your file is NOT a STREAM_LF file, the above will not work. You can
  either write a program to remove the CR character, use Perl or a similar
  tool, or use a text editor.  For example, using EVE, the following
  keystrokes will remove all "visible" CR characters from any file (though
  with a file the size of yours it might take a while!):
    $ EDIT/TPU filespec
    Prompt                          Keystroke(s)    Explanation
    none                            <DO>            Enter command mode
    Command:                        REPLACE<CR>     Enter the REPLACE command
    Old String:                     <CTRL V>        Used to enter control characters
    Press the key to be added:      <CR>            Enter CR as data
                                    <CR>            Terminate the old string
    New String:                     <CR>            Terminate the new string
    Replace? Type Yes, No, All, Last, or Quit:
                                    A<CR>           Replace all instances
                                    <CTRL Z>        Write new file and exit
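  For the "write a program" option, a minimal Python sketch might look
  like the following. It assumes the file can be read as a plain byte
  stream, and the function name and the file names in the usage note are
  hypothetical. Unlike the EVE keystrokes, it removes only a CR that sits
  immediately before an LF and leaves any other CR alone.

```python
def strip_cr_before_lf(src_path, dst_path):
    """Copy src to dst, turning each CR+LF line ending into a lone LF.

    Returns the number of line endings changed. The file is read one
    line at a time, so memory use stays small even for a large file.
    """
    changed = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:                  # binary mode: lines end at LF
            if line.endswith(b"\r\n"):
                line = line[:-2] + b"\n"  # drop the CR, keep the LF
                changed += 1
            dst.write(line)
    return changed
```

  A call such as strip_cr_before_lf("input.txt", "output.txt") writes a
  new file and leaves the original untouched, which makes it easy to
  compare the two before deleting anything.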
  Here is a DCL procedure which will remove ONE CR character from each
  record in a sequential file and produce a new file (with VFC format):
    $ IF p1.EQS."" THEN INQUIRE p1 "Input file"
    $ IF p2.EQS."" THEN INQUIRE p2 "Output file"
    $ OPEN/READ in 'p1'
    $ OPEN/WRITE out 'p2'
    $ cr[0,8]=13    ! CR character
    $ loop: READ/END=Cleanup in line
    $   line=line-cr
    $   WRITE out line
    $ GOTO loop
    $ Cleanup:
    $   CLOSE in
    $   CLOSE out
    $ EXIT
  If you require a non-VFC file, use CREATE/FDL, COPY NLA0: filename,
  or other tool to create a non-VFC sequential file format, then use
  OPEN/APPEND on the file.  (The DCL OPEN command defaults to VFC.)
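
  The "ONE CR per record" behaviour of the procedure comes from DCL
  string subtraction, which removes only the first occurrence of the
  substring. The Python sketch below mimics that rule (the function name
  is invented for illustration) so the effect on records holding more
  than one CR is easy to see.

```python
CR = "\r"

def remove_one_cr(record):
    """Mimic DCL's `line - cr`: delete only the first CR in the record."""
    return record.replace(CR, "", 1)

print(repr(remove_one_cr("data\r")))   # a single trailing CR is removed
print(repr(remove_one_cr("a\rb\r")))   # only the first of two CRs is removed
```

  If a record may hold a CR in its data as well as at its end, the first
  CR found is the one removed, which may not be the trailing one.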

answer written or last revised on ( 25-FEB-2000 )
