Inline Assembly/Examples
What follows is a collection of inline assembly functions so common that they should be useful to most OS developers using GCC. Other compilers may have intrinsic alternatives (see references). These functions are implemented using GNU extensions to the C language, so particular keywords may cause trouble if you disable GNU extensions. You can still use a disabled keyword such as asm by switching to the alternate keyword in the reserved namespace, such as __asm__. Take care to get inline assembly exactly right: the compiler does not understand the assembly it emits, and lying to it about operands or side effects can cause rare, nasty bugs.
WARNING!
Be aware that asm(""); (no constraints, known as basic asm) and asm("":); (empty constraints, a form of extended asm) are not the same! Among other differences, in basic asm you must prefix registers with a single percent sign '%', while in extended asm (even if the list of constraints is empty) you must use '%%' as the prefix in the assembly template string. If you get gcc error messages like "Error: bad register name '%%eax'", insert a colon (':') before the closing parenthesis. Extended asm is generally preferred, though there are cases where basic asm is necessary. Also note that this register prefix only applies to AT&T syntax assembly, gcc's default.
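The difference in register prefixes is easiest to see side by side. The following is a minimal sketch: the function names are made up for illustration, the __asm__ spelling is used so the snippet also builds with GNU extensions disabled, and the non-x86 branch is only a placeholder so that the snippet compiles elsewhere.

```c
#include <stdint.h>

/* Basic asm (no colon): the template is passed through as written, so
 * registers take a single '%'. xchg %bx, %bx leaves all registers unchanged. */
static inline void basic_asm_example(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ( "xchg %bx, %bx" );
#endif
}

/* Extended asm (at least one colon): %0, %1, ... name operands, so literal
 * register names must be written with '%%'. */
static inline uint32_t extended_asm_example(uint32_t x)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t ret;
    __asm__ ( "mov %1, %%eax\n\t"
              "add $1, %%eax\n\t"
              "mov %%eax, %0"
              : "=r"(ret)
              : "r"(x)
              : "eax", "cc" );   /* eax and the flags are modified */
    return ret;
#else
    return x + 1;   /* placeholder for non-x86 targets, not inline assembly */
#endif
}
```

Writing "%%eax" in the first function, or "%eax" in the second, should produce the kind of assembler error described above.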
Memory access
FAR_PEEKx
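Read an 8/16/32-bit value at a given memory location using a segment other than the default C data segment. Unfortunately, there is no constraint for manipulating segment registers directly, so the mov <reg>, <segmentreg> has to be issued manually.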
static inline uint32_t farpeekl(uint16_t sel, void* off)
{
    uint32_t ret;
    /* sel is kept in a register: a segment register cannot be loaded from an immediate */
    asm ( "push %%fs\n\t"
          "mov  %1, %%fs\n\t"
          "mov  %%fs:(%2), %0\n\t"
          "pop  %%fs"
          : "=r"(ret) : "r"(sel), "r"(off) );
    return ret;
}
FAR_POKEx
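Write an 8/16/32-bit value to a segment:offset address. Note that, much like farpeek, this version of farpoke saves and restores the segment register used for the access.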
static inline void farpokeb(uint16_t sel, void* off, uint8_t v)
{
    asm ( "push %%fs\n\t"
          "mov  %0, %%fs\n\t"
          "movb %2, %%fs:(%1)\n\t"
          "pop  %%fs"
          : : "r"(sel), "r"(off), "q"(v) : "memory" );
    /* "memory" tells the compiler that the asm writes memory it cannot see;
     * "q" keeps v in a byte-addressable register (a, b, c or d). */
}
I/O access
OUTx
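Sends an 8/16/32-bit value to an I/O location. Traditional names are outb, outw and outl respectively. The a constraint forces val to be placed in the eax register before the asm command is issued, and Nd allows one-byte constant values to be assembled as immediates, freeing the edx register for other uses.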
static inline void outb(uint16_t port, uint8_t val)
{
    asm volatile ( "outb %0, %1" : : "a"(val), "Nd"(port) );
    /* There's an outb %al, $imm8 encoding for compile-time constant port numbers that fit in 8 bits (N constraint).
     * Wider immediate constants would be truncated at assemble time (e.g. with an "i" constraint).
     * The outb %al, %dx encoding is the only option for all other cases.
     * %1 expands to %dx because port is a uint16_t; %w1 could be used if the port number were held in a wider C type. */
}
INx
Receives an 8/16/32-bit value from an I/O location. Traditional names are inb, inw and inl respectively.
static inline uint8_t inb(uint16_t port)
{
    uint8_t ret;
    asm volatile ( "inb %1, %0"
                   : "=a"(ret)
                   : "Nd"(port) );
    return ret;
}
IO_WAIT
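Wait a very small amount of time (1 to 4 microseconds, generally). Useful for implementing a small delay for PIC remapping on old hardware, or generally as a simple but imprecise wait.
You can do an I/O operation on any unused port: the Linux kernel by default uses port 0x80, which is often used during POST to log information on the motherboard's hex display but is almost always unused after boot.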
static inline void io_wait(void)
{
    outb(0x80, 0);
}
Interrupt-related functions
Enabled?
Returns true if IRQs are enabled for the CPU.
static inline bool are_interrupts_enabled(void)
{
    unsigned long flags;
    asm volatile ( "pushf\n\t"
                   "pop %0"
                   : "=g"(flags) );
    return flags & (1 << 9);   /* bit 9 is the interrupt enable flag (IF) */
}
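The same bit test also works on a flags value that was saved earlier, which separates the decision from the asm itself. A minimal sketch, assuming the hypothetical names FLAGS_IF and flags_interrupts_enabled (neither is part of any standard header):

```c
#include <stdbool.h>

/* Bit 9 of EFLAGS/RFLAGS is the interrupt enable flag (IF). */
#define FLAGS_IF (1UL << 9)

/* Pure helper: decide from a saved flags value whether IRQs were enabled
 * at the time the flags were captured. */
static inline bool flags_interrupts_enabled(unsigned long flags)
{
    return (flags & FLAGS_IF) != 0;
}
```

Because the helper executes no instructions that depend on privilege level, it can be exercised in an ordinary user-space test.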
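Push/pop interrupt flag
Sometimes it is helpful to disable interrupts and then re-enable them only if they were enabled before. While the function above can be used for that, the following functions behave the same way regardless of the state of IF: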
static inline unsigned long save_irqdisable(void)
{
    unsigned long flags;
    asm volatile ("pushf\n\tcli\n\tpop %0" : "=r"(flags) : : "memory");
    return flags;
}

static inline void irqrestore(unsigned long flags)
{
    asm ("push %0\n\tpopf" : : "rm"(flags) : "memory","cc");
}

static void intended_usage(void)
{
    unsigned long f = save_irqdisable();
    do_whatever_without_irqs();
    irqrestore(f);
}
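The memory clobber forces ordering; the cc clobber is there because all condition codes are overwritten in the second example. The second case needs no volatile: it has no outputs and is therefore always volatile.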
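Why saving and restoring is preferable to a plain cli/sti pair shows up when critical sections nest. The sketch below is only a user-space model: model_if is a hypothetical stand-in for the CPU's IF bit, and the model_ functions mimic what save_irqdisable() and irqrestore() do, without executing any privileged instruction.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the IF bit, so the pattern can run in user space. */
static bool model_if = true;

static inline unsigned long model_save_irqdisable(void)
{
    unsigned long flags = model_if ? (1UL << 9) : 0UL;  /* models pushf; pop */
    model_if = false;                                   /* models cli */
    return flags;
}

static inline void model_irqrestore(unsigned long flags)
{
    model_if = (flags & (1UL << 9)) != 0;               /* models push; popf */
}

/* Inner critical section: correct whether the caller had IRQs on or off. */
static inline void inner_section(void)
{
    unsigned long f = model_save_irqdisable();
    /* ... work that must not be interrupted ... */
    model_irqrestore(f);   /* restores "off" if the caller had disabled IRQs */
}

/* Outer critical section that calls the inner one. */
static inline bool nested_sections_keep_irqs_off(void)
{
    unsigned long f = model_save_irqdisable();
    inner_section();
    bool still_off = !model_if;   /* the inner restore must not re-enable */
    model_irqrestore(f);
    return still_off;
}
```

Had inner_section() ended with an unconditional sti, the outer section would continue with interrupts enabled.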
LIDT
Define a new interrupt table.
static inline void lidt(void* base, uint16_t size)
{   // This function works in 32 and 64bit mode
    struct {
        uint16_t length;
        void*    base;
    } __attribute__((packed)) IDTR = { size, base };
    asm ( "lidt %0" : : "m"(IDTR) );  // let the compiler choose an addressing mode
}
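Two details of the operand are easy to get wrong: the structure must have no padding between its two fields, and the 16-bit field is a limit (table size in bytes minus one) rather than a byte count. A minimal sketch that checks both, assuming the hypothetical names struct idtr and idt_limit:

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as the IDTR operand built inside lidt() above. */
struct idtr {
    uint16_t limit;
    void*    base;
} __attribute__((packed));

/* Without packed, padding after limit would move base away from offset 2. */
_Static_assert(offsetof(struct idtr, base) == 2, "base must directly follow limit");
_Static_assert(sizeof(struct idtr) == 2 + sizeof(void*), "no padding allowed");

/* The limit is the table size in bytes minus one. entry_size is 8 bytes in
 * 32-bit protected mode and 16 bytes in long mode. */
static inline uint16_t idt_limit(unsigned entries, unsigned entry_size)
{
    return (uint16_t)(entries * entry_size - 1);
}
```

With these helpers, a full 256-entry table in long mode would be loaded as lidt(idt, idt_limit(256, 16)), where idt is whatever array holds the entries.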
CPU-related functions
CPUID
Request CPU identification. See CPUID for more information.
/* GCC has a <cpuid.h> header you should use instead of this. */
static inline void cpuid(int code, uint32_t* a, uint32_t* d)
{
    asm volatile ( "cpuid" : "=a"(*a), "=d"(*d) : "0"(code) : "ebx", "ecx" );
}
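As the comment above suggests, <cpuid.h> already wraps the instruction. The sketch below reads the vendor string through its __get_cpuid() helper; cpu_vendor is a hypothetical name, and the non-x86 branch exists only so the snippet compiles on other targets.

```c
#include <string.h>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

/* Writes the 12-byte vendor string plus a terminating NUL into out.
 * Returns 1 on success, 0 if CPUID leaf 0 is not available. */
static inline int cpu_vendor(char out[13])
{
    out[0] = '\0';
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;
    /* Leaf 0 returns the vendor string in EBX, EDX, ECX (in that order). */
    memcpy(out + 0, &ebx, 4);
    memcpy(out + 4, &edx, 4);
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
    return 1;
#else
    return 0;   /* placeholder for non-x86 targets */
#endif
}
```

Unlike the hand-written version, __get_cpuid() returns all four registers, so nothing has to be listed as a clobber.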
RDTSC
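Read the current value of the CPU's time-stamp counter and store it into EDX:EAX. The time-stamp counter contains the number of clock ticks elapsed since the last CPU reset. The value is stored in a 64-bit MSR and is incremented after each clock cycle.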
static inline uint64_t rdtsc(void)
{
    uint64_t ret;
    asm volatile ( "rdtsc" : "=A"(ret) );   /* 32-bit x86: "A" is the edx:eax pair */
    return ret;
}
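This can be used to find out how long certain functions take to execute, which is very useful for testing, benchmarking and so on. Note: this is only an approximation.
On x86_64, the "A" constraint no longer means the edx:eax pair: a 64-bit operand fits in a single register, so GCC places it in either rax or rdx. GCC can therefore compile the code above without ever reading rdx, and the upper half of the counter is lost. Instead, you need to combine the halves manually with a bit shift: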
uint64_t rdtsc(void)
{
    uint32_t low, high;
    asm volatile("rdtsc":"=a"(low),"=d"(high));
    return ((uint64_t)high << 32) | low;
}
Read GCC Inline Assembly Machine Constraints for more details.
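The two-register form also compiles for 32-bit targets, so it can serve both modes. Below is a minimal sketch of a timing helper built on it; combine_u32, rdtsc_now and ticks_for are hypothetical names, and the non-x86 branch is only a placeholder.

```c
#include <stdint.h>

/* Combine two 32-bit halves. The cast must come before the shift:
 * shifting a 32-bit value by 32 is undefined behaviour in C. */
static inline uint64_t combine_u32(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}

static inline uint64_t rdtsc_now(void)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t low, high;
    __asm__ __volatile__ ( "rdtsc" : "=a"(low), "=d"(high) );
    return combine_u32(high, low);
#else
    return 0;   /* placeholder for non-x86 targets */
#endif
}

/* Ticks elapsed while fn() ran; an approximation, as noted above. */
static inline uint64_t ticks_for(void (*fn)(void))
{
    uint64_t start = rdtsc_now();
    fn();
    return rdtsc_now() - start;
}
```

A call such as ticks_for(some_function) then gives a rough cost in clock ticks, including the overhead of the two rdtsc instructions and the indirect call.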
READ_CRx
Read the value in a control register.
static inline unsigned long read_cr0(void)
{
    unsigned long val;
    asm volatile ( "mov %%cr0, %0" : "=r"(val) );
    return val;
}
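The raw value is usually inspected bit by bit. A minimal sketch of decoding a few architecturally defined CR0 bits; the macro and function names are chosen here for illustration only:

```c
#include <stdbool.h>

#define CR0_PE (1UL << 0)    /* protected mode enable */
#define CR0_WP (1UL << 16)   /* write protect */
#define CR0_PG (1UL << 31)   /* paging */

/* True if the given CR0 value has paging turned on. */
static inline bool cr0_paging_enabled(unsigned long cr0)
{
    return (cr0 & CR0_PG) != 0;
}

/* True if the given CR0 value has protected mode turned on. */
static inline bool cr0_protected_mode(unsigned long cr0)
{
    return (cr0 & CR0_PE) != 0;
}
```

In a kernel, the argument would come from read_cr0(), for example cr0_paging_enabled(read_cr0()).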
INVLPG
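Invalidates the TLB (translation lookaside buffer) entry for one specific virtual address. The next memory reference to the page will be forced to re-read the PDE and PTE from main memory. It must be issued every time you update one of those tables. The m pointer points to a logical address, not a physical or virtual one: an offset into your ds segment.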
static inline void invlpg(void* m)
{
    /* Clobber memory to avoid optimizer re-ordering access before invlpg, which may cause nasty bugs. */
    asm volatile ( "invlpg (%0)" : : "b"(m) : "memory" );
}
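Since invlpg covers only the page that contains the given address, a change to a mapping that spans several pages needs one call per page. The sketch below computes how many pages a byte range touches; EXAMPLE_PAGE_SIZE and pages_spanned are hypothetical names, and 4 KiB pages are assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/* Number of 4 KiB pages touched by the byte range [addr, addr + len). */
static inline size_t pages_spanned(uintptr_t addr, size_t len)
{
    if (len == 0)
        return 0;
    uintptr_t mask  = ~(uintptr_t)(EXAMPLE_PAGE_SIZE - 1);
    uintptr_t first = addr & mask;               /* page of the first byte */
    uintptr_t last  = (addr + len - 1) & mask;   /* page of the last byte  */
    return (size_t)((last - first) / EXAMPLE_PAGE_SIZE) + 1;
}
```

A flush loop would then call invlpg() once for each of those pages, starting at the page-aligned address of the first byte and stepping by the page size.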
WRMSR
Write a 64-bit value to an MSR. On 32-bit x86, the A constraint stands for the concatenation of registers EAX and EDX.
static inline void wrmsr(uint32_t msr_id, uint64_t msr_value)
{
    asm volatile ( "wrmsr" : : "c" (msr_id), "A" (msr_value) );
}
On x86_64, this needs to be:
static inline void wrmsr(uint64_t msr, uint64_t value)
{
    uint32_t low = value & 0xFFFFFFFF;
    uint32_t high = value >> 32;
    asm volatile (
        "wrmsr"
        :
        : "c"(msr), "a"(low), "d"(high)
    );
}
RDMSR
Read a 64-bit value from an MSR. On 32-bit x86, the A constraint stands for the concatenation of registers EAX and EDX.
static inline uint64_t rdmsr(uint32_t msr_id)
{
    uint64_t msr_value;
    asm volatile ( "rdmsr" : "=A" (msr_value) : "c" (msr_id) );
    return msr_value;
}
On x86_64, this needs to be:
static inline uint64_t rdmsr(uint64_t msr)
{
    uint32_t low, high;
    asm volatile (
        "rdmsr"
        : "=a"(low), "=d"(high)
        : "c"(msr)
    );
    return ((uint64_t)high << 32) | low;
}
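MSRs are normally updated with a read-modify-write sequence, so that bits other than the one being changed keep their value. The sketch below shows the value arithmetic only, using IA32_EFER and its NXE bit as the example; the helper names are hypothetical, and nothing here executes rdmsr or wrmsr.

```c
#include <stdint.h>

#define MSR_IA32_EFER 0xC0000080u    /* extended feature enable register */
#define EFER_NXE      (1ULL << 11)   /* no-execute enable */

/* The halves that wrmsr expects in EAX (low) and EDX (high). */
static inline uint32_t msr_low(uint64_t v)  { return (uint32_t)(v & 0xFFFFFFFFu); }
static inline uint32_t msr_high(uint64_t v) { return (uint32_t)(v >> 32); }

/* Value to write back in order to set NXE while leaving all other bits alone. */
static inline uint64_t efer_with_nxe(uint64_t efer)
{
    return efer | EFER_NXE;
}
```

In a kernel this would be used as wrmsr(MSR_IA32_EFER, efer_with_nxe(rdmsr(MSR_IA32_EFER))).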