
Pointer Size Environment

In document dtj v08 02 1996 pdf (Page 74-77)

A central goal in the implementation of 64-bit addressing on the OpenVMS operating system was to provide upward-compatible support for applications that use the existing 32-bit address space. Another guiding principle was that mixed pointer sizes are likely to be the rule rather than the exception for applications that use 64-bit address space. These factors drove several key design decisions in the OpenVMS Calling Standard and programming interfaces, the DEC C language support, and the system services support. For example, self-identifying 64-bit descriptors were designed to ease development when mixed pointer sizes are used. DEC C support makes it easy to mix pointer sizes and to recompile for uniform 32- or 64-bit pointer sizes. OpenVMS system services remain fully upward compatible, with new services defined only where required or to enhance the usability of the huge 64-bit address space. This paper describes the approaches taken to support the mixed pointer size environment in these areas. The issues and rationale behind these OpenVMS and DEC C solutions are presented to encourage others who provide library interfaces to use a consistent programming interface approach.

Digital Technical Journal Vol. 8 No. 2 1996


Thomas R. Benson

Karen L. Noel

Richard E. Peterson

Support for 64-bit virtual addressing on the OpenVMS Alpha operating system, version 7.0, has vastly increased the amount of virtual address space available for application use.1 At the same time, fully compatible support for applications that use only 32-bit addresses (also called pointers) has been preserved.

An application that mixes 32-bit and 64-bit pointer sizes operates in a mixed pointer size environment. Mixed pointer size applications were the design center for the initial implementation of 64-bit support in the OpenVMS operating system. This paper discusses the reasons why mixing pointer sizes is expected to be a common practice and describes the design of operating system and language features that are provided to ease programming in this mixed pointer size environment.

Reasons for Mixed Pointer Sizes

To use 64-bit address space, some simple applications need only be recompiled for a uniform 64-bit pointer size. For example, self-contained DEC C applications that rely on only the C run-time library, without using system services or other libraries, can take this approach. Real-world applications are seldom this clean-cut, however. In more complex applications, where 64-bit address space is likely to be needed, mixes of languages, dependencies on system interfaces and other libraries, and reliance on third-party packages or libraries are common. These practices all lead to the mixed pointer size environment in which applications continue to use some 32-bit addresses while taking advantage of 64-bit virtual address space.

Applications that are likely to take advantage of 64-bit memory are those in which the declaration and management of a large data set can be logically separated from the rest of the program. This separation does not need to be at the source file level. It can be at a program flow level, indicating which internal and external interfaces will be given 64-bit addresses to work with.

The following sections explore the reasons for mixing pointer sizes.

OpenVMS and Language Support

Implementation choices that Digital made for this first release of the OpenVMS operating system that supports 64-bit virtual addressing will probably encourage mixed pointer size programming. These choices were driven largely by the need for absolute upward compatibility for existing programs and the goal of supporting large, dynamic data sets as the primary application for 64-bit addressing.

Dynamic Data Only. OpenVMS services support dynamic allocation of 64-bit address space. This mechanism most closely resembles the malloc and free functions for allocating and deallocating dynamic storage in the C programming language. Allocation of this type differs from static and stack storage in that explicit source statements are required to manage it. For static and stack storage, the system is allocating the memory on behalf of the application at image activation time. (Of course, the allocation may be extended during execution in the case of stack storage.) This allocation continues to be from 32-bit addressable space.

Two special cases of static allocation are worth mentioning. Linkage sections, which are program sections that contain routine linkage information, and code sections, which contain the executable instructions, do not differ substantially from preinitialized static storage. As a result, these sections also reside only in 32-bit addressable memory.

Upward-compatibility Constraints. The OpenVMS Alpha operating system is cautious to avoid using 64-bit memory freely where it may prevent upward compatibility for 32-bit applications. For example, the linkage section might seem to be a natural candidate for the OpenVMS system to allocate automatically in 64-bit memory. This allocation would essentially free more 32-bit addressable memory for application use; however, even if this were done only for applications relinked for new versions of the OpenVMS operating system, there is no guarantee that all object code treats linkage section addresses as 64 bits in width. A simple example is storing the address of a routine in a structure. Since a routine's address is the address of its procedure descriptor in the linkage section, moving the linkage section to 64-bit memory would cause code that stores this address in a 32-bit cell to fail.

Allocating the user stack in 64-bit space also appears to be a good opportunity to easily increase the amount of memory available to an application. Stack addresses are often more visible to application code than linkage section addresses are. For instance, a routine can easily allocate a local variable using temporary storage on the stack and pass the address of the variable to another routine. If the stack is moved to 64-bit space, this address quietly becomes a 64-bit address. If the called routine is not 64-bit capable, attempts to use the address will fail.
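The failure mode described in the last two paragraphs can be illustrated with a minimal sketch in standard C. Addresses are modeled as plain 64-bit integers and the function names are hypothetical; nothing here is an OpenVMS or DEC C interface. The sketch assumes that a 32-bit cell is widened by sign extension when it is read back, so an address survives the round trip only if bits 63 through 31 are all zeros or all ones.

```c
#include <stdint.h>

/* Keep only the low 32 bits of an address, as a 32-bit cell would. */
uint32_t store_in_cell32(uint64_t addr)
{
    return (uint32_t)(addr & UINT64_C(0xFFFFFFFF));
}

/* Return 1 if sign extending the 32-bit cell would recover the original
   address, that is, if bits 63..31 are all zeros or all ones. */
int survives_cell32(uint64_t addr)
{
    uint64_t upper = addr >> 31;   /* the 33 bits from 63 down to 31 */
    return upper == 0 || upper == UINT64_C(0x1FFFFFFFF);
}
```

An address just above the 32-bit range, such as 0x0000000100002000, is stored as 0x00002000 and no longer refers to the original location, which is the quiet failure the text warns about.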

Focus on Services Required for Large Data Sets. Not all system services could be changed to support 64-bit addresses (i.e., promoted) in time for the first version of the OpenVMS operating system to support 64-bit addressing. With the mixed-pointer model in mind, we focused on those services that were likely to be required for large data sets. For example, to allow I/O directly to and from high memory, it was essential that the I/O queuing service, SYS$QIO, accept a 64-bit buffer address. Conversely, the SYS$TRNLNM service for translating a logical name did not need to be modified to accept 64-bit addresses. Its arguments include a logical name, a table name, and a vector that contains requests for information about the name. These are small data elements that are unlikely to require 64-bit addressing on their own. Of course, they may be part of some larger structure that resides in 64-bit space. In this case, they can easily be copied to or from 32-bit addressable memory.

System services are discussed further in the section OpenVMS System Services. The 32-bit address restriction on certain system services again emphasizes the importance of being able to logically separate large data set support from the rest of an application.

Limited Language Support. Another interface point that requires care when using 64-bit addressing is at calls between modules written in different programming languages. The OpenVMS Calling Standard traditionally makes it easy to mix languages in an application, but DEC C is the only high-level language to fully support 64-bit addresses in the first 64-bit-capable version of the OpenVMS operating system.2

The use of 64-bit addresses in mixed-language applications is possible, and data that contains 64-bit addresses may even be shared; however, references that actually use the data pointed to by these addresses need to be limited to DEC C code or assembly language. Mixed high-level language applications are certain to be mixed pointer size applications in this version of the operating system.

Support for 32-bit Libraries

Many applications rely on library packages to provide some aspect of their functionality. Typical examples include user interface packages, graphics libraries, and database utilities. Third-party libraries may or may not support 64-bit addresses. Applications that use these libraries will probably mix 32-bit and 64-bit pointer sizes and will therefore require an operating system that supports mixed pointer sizes.


Implications of Full 64-bit Conversion

For some applications, it may be desirable to mix pointer sizes to avoid the side effects of universal 64-bit address conversion. The approach of recompiling everything with 64-bit address widths is sometimes called "throwing the switch." An obvious implication of throwing the switch is that all pointer data doubles in size. For complex linked data structures, this can be a significant overall increase in size. Increasing the pointer size may also reveal hidden dependencies on pointer size being the same as integer size. If code accesses a cell as both a 32-bit integer and a 32-bit pointer, the code will no longer work if the pointer is enlarged. Thus, universally increasing the pointer size may force changes to code that would otherwise continue to work.
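To make the size implication concrete, the following sketch compares a doubly linked node before and after every pointer field is widened. The pointer fields are modeled as fixed-width integers so that the example compiles on any platform; the structure and function names are hypothetical, and the exact size of the widened node depends on the padding rules of the compiler in use.

```c
#include <stddef.h>
#include <stdint.h>

/* Doubly linked node whose link fields are modeled as 32-bit pointers. */
struct node32 {
    uint32_t next;
    uint32_t prev;
    int32_t  key;
};

/* The same node after "throwing the switch": both links become 64 bits. */
struct node64 {
    uint64_t next;
    uint64_t prev;
    int32_t  key;
};

/* Storage needed for a list of n nodes in each representation. */
size_t list_bytes32(size_t n) { return n * sizeof(struct node32); }
size_t list_bytes64(size_t n) { return n * sizeof(struct node64); }
```

Each node grows by at least 8 bytes because two 4-byte links become 8-byte links. On many platforms alignment padding raises the widened node from 12 to 24 bytes, so the whole list doubles in size even though the payload is unchanged.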

There is a more compelling reason for not throwing the switch for code that is part of a shared library. Library packages must not return 64-bit addresses to users of the library unless the calling code is definitely 64-bit capable. If the library developer throws the switch when building a library written in DEC C, all memory returned by the malloc function will be in 64-bit address space. This can be a problem if the address is blindly returned to a library caller. If a library is to work in a mixed pointer size environment, and it sometimes returns pointers to memory it has allocated, it needs to use mixed pointer sizes internally.

Programming Interface Issues

The coexistence of 32-bit and 64-bit pointers raised several design questions for operating system and language support, particularly in the area of routine interfaces. When an application or library is being modified to use 64-bit address space, argument passing may be the most exposed area. In this section, we describe how mixed pointer size support affects argument-passing mechanisms and the design decisions made to ease the coexistence of mixed pointer sizes.

Argument List Width

Even before the introduction of 64-bit addressing, the OpenVMS Calling Standard defined argument list elements to be 64 bits in width. When passing a 32-bit address (that is, when passing an item in 32-bit space by reference), compilers sign extend the 32-bit value into the 64-bit argument location.3 Passing 64-bit addresses as values works transparently without changing the calling standard, assuming, of course, that the called routine expects to receive 64-bit addresses. Passing 32-bit addresses as values to routines that expect 64-bit addresses works properly because the values have been sign extended to a 64-bit width.

Pointers by Reference

Passing the addresses of pointers requires special care when mixing pointer sizes. If the caller passes a 32-bit address by reference, and the called routine reads it as a 64-bit address from memory, the upper 32 bits will be incorrect. Similarly, if the address of a 64-bit address is passed, and the called routine reads only 32 bits from memory, it will fail when that address is used.
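The contrast between passing an address by value and passing a pointer by reference can be sketched in standard C. The example below simulates memory with a byte buffer and models addresses as integers; all function names are hypothetical and none of them is an OpenVMS interface. It shows sign extension into a 64-bit argument slot, and then the two mismatched reads described above.

```c
#include <stdint.h>
#include <string.h>

/* Widen a 32-bit address into a 64-bit argument slot by sign extension. */
uint64_t widen_addr32(uint32_t addr32)
{
    uint64_t wide = addr32;
    if (addr32 & UINT32_C(0x80000000))
        wide |= UINT64_C(0xFFFFFFFF00000000);
    return wide;
}

/* A callee that assumes the cell it was handed holds a 64-bit address.
   If the caller stored only 32 bits, the other 32 bits come from
   whatever happens to sit next to the cell in memory. */
uint64_t read_as_addr64(const unsigned char *cell)
{
    uint64_t value;
    memcpy(&value, cell, sizeof value);
    return value;
}

/* A callee that assumes the cell holds a 32-bit address. If the caller
   stored a 64-bit address, half of it is silently ignored. */
uint64_t read_as_addr32(const unsigned char *cell)
{
    uint32_t value;
    memcpy(&value, cell, sizeof value);
    return widen_addr32(value);
}
```

In the by-value case the widening happens in the argument slot, so the callee always sees a well-formed 64-bit value. In the by-reference case the callee reads memory that the caller laid out, and no widening step exists to hide a disagreement about the cell size.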

This is the simplest case in which support of 64-bit addresses may require a programming interface change for 64-bit callers. A single entry point that receives a pointer by reference cannot tell which size pointer it has received. Some possible solutions include a new alternate entry point for 64-bit-capable callers or a new parameter indicating the size of the address.
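Both suggested solutions can be sketched together in standard C. In this hypothetical example, a common worker routine takes an explicit size parameter, the existing entry point keeps its 32-bit contract, and a new alternate entry point serves 64-bit-capable callers. The routine names and the "_64" suffix are assumptions made for illustration; the text does not prescribe a naming scheme here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical common worker: the caller states how wide the pointer
   cell is, so the routine never has to guess. */
uint64_t fetch_pointer_arg(const void *cell, unsigned size_in_bytes)
{
    assert(size_in_bytes == 4 || size_in_bytes == 8);

    if (size_in_bytes == 8) {
        uint64_t wide;
        memcpy(&wide, cell, sizeof wide);
        return wide;
    }

    uint32_t narrow;
    memcpy(&narrow, cell, sizeof narrow);

    /* Sign extend the 32-bit address to 64 bits. */
    uint64_t wide = narrow;
    if (narrow & UINT32_C(0x80000000))
        wide |= UINT64_C(0xFFFFFFFF00000000);
    return wide;
}

/* Existing entry point: unchanged contract, the cell holds 32 bits. */
uint64_t lib_use_pointer(const void *cell)
{
    return fetch_pointer_arg(cell, 4);
}

/* New alternate entry point for 64-bit-capable callers. */
uint64_t lib_use_pointer_64(const void *cell)
{
    return fetch_pointer_arg(cell, 8);
}
```

Keeping the original entry point untouched preserves upward compatibility for existing callers, while new callers opt in explicitly by choosing the alternate entry point or by passing the size.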

Pointers Embedded in Structures

Pointers passed by reference are a special case of the more general problem of passing structures that contain pointers. Again, the caller and called routine must agree on the size of the pointers contained in the structure. This case offers an option that may not require a new programming interface, however. If the structure is self-identifying, the routine may be able to tell which form of the structure it has received and dispatch to appropriate code for the corresponding pointer length.
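A minimal sketch of this dispatch, in standard C, might look as follows. The structure layouts and form codes are hypothetical and exist only to illustrate the idea; the single assumption that matters is that both forms begin with the same identifying field, so the routine can inspect it before deciding how to read the rest.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical form codes stored in the first field of each structure. */
enum { FORM_PTR32 = 1, FORM_PTR64 = 2 };

struct request32 { uint32_t form; uint32_t buffer_addr; };
struct request64 { uint32_t form; uint32_t reserved; uint64_t buffer_addr; };

/* Sign extend a 32-bit address to 64 bits. */
static uint64_t sign_extend32(uint32_t v)
{
    uint64_t wide = v;
    if (v & UINT32_C(0x80000000))
        wide |= UINT64_C(0xFFFFFFFF00000000);
    return wide;
}

/* Single entry point: read the form code first, then dispatch on it.
   Returns 0 on success and -1 if the form is not recognized. */
int get_buffer_addr(const void *request, uint64_t *addr_out)
{
    uint32_t form;
    memcpy(&form, request, sizeof form);   /* common leading field */

    if (form == FORM_PTR32) {
        struct request32 r;
        memcpy(&r, request, sizeof r);
        *addr_out = sign_extend32(r.buffer_addr);
        return 0;
    }
    if (form == FORM_PTR64) {
        struct request64 r;
        memcpy(&r, request, sizeof r);
        *addr_out = r.buffer_addr;
        return 0;
    }
    return -1;   /* unknown form: fail predictably */
}
```

Because the routine rejects a form code it does not know, a caller that passes a newer structure to an older routine gets an error rather than a misread address.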

Function Return Values

Function return values are also defined to be 64 bits in width, so no calling standard change was required to support 64-bit pointer returns. It is important that a 64-bit address not be returned blindly, though, unless it is known that the caller is 64-bit capable. Typically, this is a problem for library support routines rather than for those within an application. A library routine should return a 64-bit address only if the routine has been specifically developed for a 64-bit environment or if it can tell with certainty, based on input parameters received, that the caller is 64-bit capable.

Calling Standard Issues

The OpenVMS Calling Standard defines register usage conventions, argument list locations, data structures, and standard practices for making procedure calls that operate correctly in a multilanguage and multithreaded environment. As mentioned earlier, this standard already defined argument list elements to be 64 bits in width; however, some key data structures defined by the standard were based on 32-bit pointer sizes. The goal of upward compatibility for existing code complicated the job of extending the standard. The following sections describe how the structures were ultimately changed and illustrate some approaches to supporting mixed pointer sizes when shared structures contain pointers.

Descriptors. Descriptors are structures defined by the calling standard to specify an argument's type, length, and address, along with other type- or structure-specific information. Typically, descriptors are used only for character strings, arrays, and complex data types such as packed decimal.

Descriptor types are by definition self-identifying by virtue of the type and class fields they contain. An obvious choice, therefore, for extending descriptors to handle 64-bit addresses would be to add new type constants for 64-bit data elements and extend the structure beyond the type fields to accommodate larger addresses and sizes. In practice, however, the address and length fields from descriptors are frequently used without accessing the type fields, particularly when a character string descriptor is expected.

As a result, a solution was sought that would yield a predictable failure, rather than incorrect results or data corruption, when a 64-bit descriptor is received by a routine that expects only the 32-bit form. The final design includes a separate 64-bit descriptor layout that contains two special fields at the same offsets as the length and address fields in the 32-bit descriptor. These fields are called MBO (must be one) and MBMO (must be minus one), respectively. The simplest versions of the 32-bit and 64-bit descriptors are illustrated in Figure 1.
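The overlay of the two marker fields can be sketched in standard C. The structure and field names below are hypothetical, and the field widths (a 16-bit length and a 32-bit address in the 32-bit form, 64-bit length and address in the 64-bit form) are assumptions made for illustration rather than a transcription of Figure 1. What the sketch takes from the text is only the placement rule: MBO sits at the offset of the 32-bit length field and MBMO sits at the offset of the 32-bit address field.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch of the 32-bit descriptor form (field widths are assumptions). */
struct desc32 {
    uint16_t length;
    uint8_t  dtype;
    uint8_t  dclass;
    uint32_t address;
};

/* Sketch of the 64-bit form: the two markers overlay the 32-bit
   length and address fields, and the real values follow them. */
struct desc64 {
    uint16_t mbo;      /* must be one; same offset as desc32.length */
    uint8_t  dtype;
    uint8_t  dclass;
    int32_t  mbmo;     /* must be minus one; same offset as desc32.address */
    uint64_t length;
    uint64_t address;
};

/* Return 1 if the leading bytes carry the 64-bit markers, else 0.
   Only the part common to both forms is examined. */
int is_desc64(const void *desc)
{
    struct desc32 head;
    memcpy(&head, desc, sizeof head);
    return head.length == 1 && head.address == UINT32_C(0xFFFFFFFF);
}
```

A 64-bit-aware routine can use a check like `is_desc64` to select the right layout. The check assumes that no valid 32-bit descriptor pairs a length of 1 with an address of minus one, which is the property that makes the marker values usable as an identifier.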

If a routine that expects a 32-bit descriptor receives a 64-bit descriptor, it will find the value 1 in the length field. This nonzero value ensures that the address will need to be read. Otherwise, the descriptor could be treated as describing a null value, and the address
