Programmers wishing to write NDIS protocol drivers for Windows NT have long used the Packet sample in the NT DDK as a learning tool. Some programmers have used the Packet sample as an armature to build their drivers around, adding code but leaving the structure intact. Unfortunately, the sample has bugs. This page attempts to collect warnings about the dangers lurking in the Packet sample. This information in this page is offered in the spirit of cooperation among programmers. The author hopes that he will be contacted by those who find additional bugs in the Packet sample or wish to correct inaccuracies in this page.
Eliyas Yakub of Microsoft has reworked the NT 4.0 Packet sample and he believes he has addressed all the bugs reported on this page. You can download the updated sample from MS Support.
Eliyas reports on his work:
I think I took care of all the bugs. I took out the cleanup routine because it was completely wrong. In the first place, it's not a good idea to reset an adapter just because an application doesn't want to use the adapter any more. Resets are expensive operation. An adapter is not a resource of one application or protocol in the system. Secondly, there was a race condition between the protocol and miniport during freeing of packet. There is no reliable way to do this in the existing architecture. There are plans to introduce new APIs to cancel a pending Recv.
I also don't like the idea of using FsContext to save the BindApapterContext. This is not generally recommended. I modified it in the W2K version but I left it as is on NT4.0.
I would like to hear about anyone's experience with the new version of the sample.
We won't list as bugs the design choices in the Packet sample that could have been done better and probably would have been done better in a production driver unless those choices harm the robustness or throughput of the driver. Under this heading come:
IoCompleteRequest(). This happens in
PacketOpen()does with its stack variable Medium when calling
PacketCleanup()two calls are made to
IoMarkIrpPending()against the Cleanup IRP FlushIrp.
PacketCleanup()an attempt is made to go from a receive packet's list entry in the packet's
ProtocolReservedto the packet itself with the use of one invocation of
CONTAINING_RECORD. This wrongly telescopes the 2 enclosing levels into one, which happens to work just because the
PACKET_RESERVEDcomes first and hence has the same address as
PacketReserved. If the order of structure members in
PACKET_RESERVEDwere changed, the driver would blow up if there were any pending reads at cleanup time. The correct way to achieve this is to invoke
CONTAINING_RECORDtwice: the first time to get a pointer to the reserved area of the packet and the second time to get a pointer to the packet itself. The correct way can be seen in
PacketReceiveIndicate(), by the way. (Thanks to Alf Ludwig for clarifications on this item.)
PacketIoControl()there's a typo in which
IOCTL_PROTOCOL_SET_OIDis tested for twice, the test results being ORed together; one of the tests should have been for
IOCTL_PROTOCOL_QUERY_OID. Because of this the driver will never ask the netcard driver for the value of an OID. (Thanks to Victor Ishikeev for noticing this bug.)
PacketReceiveIndicatehas no use for a packet it returns
NDIS_STATUS_SUCCESSwhen it should return
NDIS_STATUS_NOT_ACCEPTED. According to the chapter on NDIS Driver Lower-Edge Functions in the DDK's NetworkReference, this will slow down overall network throughput. Presumably this would be especially serious in a multi-protocol setting.
WriteFile()wrote zero bytes: Due to a bug in PacketSendComplete() noticed by Skip Hanson, Irp->IoStatus.Information is zeroed. Skip recommends
NdisQueryPacket(pPacket, NULL, NULL, NULL, &(pIrp->IoStatus.Information))to fix this. In my NT protocol driver I use MmGetMdlByteCount() to show the calling app the number of bytes sent by the app (as opposed to the grand total including the framing added by the driver.) You can see this in BgpsSendCompleteHandler() in ntbgps.c in the billgPC zipfile.
But why is this so? This is a point that isn't covered adequately
by what might be called the
standard sources (the DDK's
documentation and Art Baker's The Windows NT Device Driver
Book), so we'll try to be thorough here. In what follows, we
are indebted to Jamie Hanrahan's postings in
comp.os.ms-windows.programmer.nt.kernel-mode as well as
the article on IRP cancel operations in the November-December 1997
issue of The NT Insider. (Responsibility for any
mistakes lies of course with this document's author.)
When all goes well, a driver's cleanup routine will be entered when a process's connection to the device's file object is severed. There are three ways to begin the process of severing the process's connection to the file object:
For the Packet sample's cleanup processing to be executed, it is not enough to begin the process of disconnecting the process from the file object. The cleanup routine will only be entered if all references to the corresponding file object have been eliminated. An IRP created by an I/O request that has not been completed has just such a reference. And the only way NT provides for a driver to clear a thread's references when Win32 tries to disconnect the file object is through IRP cancel routines.
The likely consequences of an application severing its connection to the Packet sample driver with an outstanding I/O (probably a read) include the following:
Rick Rigby was kind enough to contribute a solution to the I/O cancel bug. You can find his code, together with some commentary warning of a danger in it, here.
Another solution to the cancel problem which complements Rigby's but has its own flaws, is in the NT driver for billgPC, the free bootp client by this page's author. One flaw is that the use of the global cancel spinlock where a private spinlock could improve overall system performance. (The normal lifespan of the driver, and the application it supports, is only a few seconds.) The second flaw is a matter of debate. In the first of the two illuminating articles on I/O cancel operations in OSR's The NT Insider the reader is strongly advised to ignore the cancel routine's IRP pointer argument and search the driver's own queue for the first IRP with a cancel flag set. This, the article claims, is the one to dequeue and cancel; the reasons for this are promised in the second installment. Unfortunately, the second part of the article lacks an explanation for this algorithm, and billgPC's driver does heed the IRP pointer argument. To see billgPC's implementation, search for BgpsCancelReadIrp in ntbgps.c in the billgPC zipfile.